(57) Abstract

A previously isolated hepatitis B virus (HBV) integration in a 147 bp cellular DNA fragment linked to hepatocellular
carcinoma (HCC) was used as a probe to clone the corresponding complementary DNA from a human liver cDNA library.
Nucleotide sequence analysis revealed that the overall structure of the cellular gene, which has been named hap, is
similar to that of the DNA-binding hormone receptors. Six out of seven hepatoma and hepatoma-derived cell lines express
a 2.5 kb hap mRNA species which is undetectable in normal adult and fetal livers, but present in all non-hepatic tissues
analyzed. Low stringency hybridization experiments revealed the existence of hap related genes in the human genome. The
cloned DNA sequence is useful in the preparation of pure hap protein and as a probe in the detection and isolation of
complementary DNA and RNA sequences. The hap protein is a retinoic acid (RA) receptor identified as RAR-β.
A NOVEL STEROID/THYROID HORMONE RECEPTOR-
RELATED GENE, WHICH IS INAPPROPRIATELY
EXPRESSED IN HUMAN HEPATOCELLULAR CARCINOMA,
AND WHICH IS A RETINOIC ACID RECEPTOR



BACKGROUND OF THE INVENTION 
This invention relates to nucleotide sequences, polypeptides
encoded by the nucleotide sequences, and to their use in diagnostic
and pharmaceutical applications.

Primary hepatocellular carcinoma (HCC) represents the most
common cancer, especially in young men, in many parts of the
world (as in China and in much of Asia and Africa) (reviewed in
Tiollais et al., 1985). Its etiology was investigated mostly by
epidemiological studies, which revealed that, beyond some minor
potential agents such as aflatoxin and sex steroid hormones,
hepatitis B virus (HBV) chronic infection could account for a
large fraction of liver cancers (Beasley and Hwang, 1984).
HBV DNA has been found to be integrated in the genome of
most cases of HCCs studied (Edman et al., 1980; Brechot et al.,
1980; Chakraborty et al., 1980; Chen et al., 1982). Nonetheless,
the role of those sequences in liver oncogenesis remains unclear.
A single HBV integration in a HCC sample in a short liver
cell sequence has been reported recently. The sequence was found
to be homologous to steroid receptor genes and to the cellular
proto-oncogene c-erbA (Dejean et al., 1986).

Ligand-dependent transcriptional activators, such as steroid
or thyroid hormone receptors, have recently been cloned, allowing
rapid progress in the understanding of their mechanism of action.
Nevertheless, there exists a need in the art for the identification
of transcripts that may encode for activational elements,
such as nuclear surface receptors, that may play a role in
hepatocellular carcinoma. Such findings would aid in identifying
corresponding transcripts in susceptible individuals. In addition,
identification of transcripts could aid in elucidating the
mechanisms by which HCC occurs.

Retinoids, a class of compounds including retinol (vitamin
A), retinoic acid (RA), and a series of natural and synthetic
derivatives, exhibit striking effects on cell proliferation,
differentiation, and pattern formation during development
(Strickland and Mahdavi, 1978; Breitman et al., 1980; Roberts and
Sporn, 1984; Thaller and Eichele, 1987). Until recently, the
molecular mechanism by which these compounds exert such potent
effects was unknown, although retinoids were thought to modify
their target cells through a specific receptor.

Except for the role of retinoids in vision, their mechanism
of action is not well understood at the molecular level. Several
possible mechanisms have been suggested. One hypothesis proposes
that retinoids are needed to serve as the lipid portion of
glycolipid intermediates involved in certain specific
glycosylation reactions. Another mechanism, which may account
for the various effects of retinoids on target cells, is that
they alter genomic expression in such cells. It has been
suggested that retinoids may act in a manner analogous to that of
the steroid hormones and that the intracellular binding proteins
(cellular retinol-binding and retinoic acid-binding protein) play
a critical part in facilitating the interaction of retinoids with
binding sites in the cell nucleus.

For example, the observation that the RA-induced
differentiation of murine F9 embryonal carcinoma cells is accompanied by
the activation of specific genes has led to the proposal that RA,
like the steroid and thyroid hormones, could exert its
transcriptional control by binding to a nuclear receptor (Roberts and
Sporn, 1984). However, the biochemical characterization of this
receptor had been hampered by high affinity RA-binding sites
corresponding to the cellular retinoic acid binding protein (CRABP),
which is thought to be a cytoplasmic shuttle for RA (Chytil and
Ong, 1984).

In any event, retinoids are currently of interest in
dermatology. The search for new retinoids has identified a number of
compounds with a greatly increased therapeutic index as compared
with naturally occurring retinoids. Extensive clinical testing
of two of these retinoids, 13-cis-retinoic acid and the aromatic
analog etretinate, has led to their clinical use in dermatology.
In addition, several lines of evidence suggest that important
relations exist between retinoids and cancer. A number of major
diseases, in addition to cancer, are characterized by excessive
proliferation of cells, often with excessive accumulation of
extracellular matrix material. These diseases include rheumatoid
arthritis, psoriasis, idiopathic pulmonary fibrosis, scleroderma,
and cirrhosis of the liver, as well as the disease process
atherosclerosis. The possibility exists that retinoids, which
can influence cell differentiation and proliferation, may be of
therapeutic value in some of these proliferative diseases. There
exists a need in the art for reagents and methods for carrying
out studies of receptor expression and effector function to
determine whether candidate drugs are agonists or antagonists of
retinoid activity in biological systems.

There also exists a need in the art for identification of 
retinoic acid receptors and for sources of retinoic acid 
receptors in highly purified form. The availability of the 
purified receptor would make it possible to assay fluids for 
agonists and antagonists of the receptor. 

SUMMARY OF THE INVENTION
This invention aids in fulfilling these needs in the art.
More particularly, this invention provides a cloned DNA sequence
encoding for a polypeptide of a newly identified cellular gene,
which has been named hap. The DNA sequence has the formula shown
in Fig. 2a, 2b, 2c, 2d successively (and collectively designated as Figure 2).
More particularly, the sequence comprises:
atgtttgactgtatggatgttctgtcagtgagtcctgggcaaatcctgattctacactgcgagtcc
gtcttcctgcatgctccaggagaaagctctcaaagcatgcttcagtggattgacccaaaccgaatg
gcagcatcggcacactgctcaatcaattgaaacacagagcagcagctctgaggaactcgtcccaag
ccccccatctccacttcctcccccpcgagtgatcaaaccctgcttcgtctgccaggacaaatcatc
agggtaccactatggggtcagcgcctgtgagggatgaagggctttttccgcagaagtattcagaag
aatatgatttacacttgtcaccgagataagaactgtgttattaataaagtcaccaggaatcgatgc
caatactgtcgactccagaagtgctttgaagtgggaatgtccaaagaatctgtcaggaatgacagg
aacaagaaaaagaaggagacttcgaagcaagaatgcacagagagctatgaaatgacagctgagttg
gacgatctcacagagaagatccgaaaagctcaccaggaaactttcccttcactctcgcagctgggt
aaatacaccacgaattccagtgctgaccatcgagtccgactggacctgggcctctgggacaaattc
agtgaactggccaccaagtgcattattaagatcgtggagtttgctaaacgtctgcctggtttcact
ggcttgaccatcgcagaccaaattaccctgctgaaggccgcctgcctggacatcctgattcttaga
atttgcaccaggtataccccagaacaagacaccatgactttctcagacggccttaccctaaatcga
actcagatgcacaatgctggatttggtcctctgactgaccttgtgttcacctttgccaaccagctc
ctgcctttggaaatggatgacacagaaacaggccttctcagtgccatctgcttaatctgtggagac
cgccaggaccttgaggaaccgacaaaagtagataagctacaagaaccattgctggaagcactaaaa
atttatatcagaaaaagacgacccagcaagcctcacatgtttccaaagatcttaatgaaaatcaca
gatctccgtagcatcagtgctaaaggtgcagagcgtgtaattaccttgaaaatggaaattcctgga
tcaatgccacctctcattcaagaaatgatggagaattctgaaggacatgaacccttgaccccaagt
tcaagtgggaacacagcagagcacagtcctagcatctcacccagctcagtggaaaacagtggggtc
agtcagtcaccactcgtgcaataa.

The invention also covers variants and fragments of the DNA
sequence. The DNA sequence is in a purified form.

This invention also provides a probe consisting of a
radionuclide bonded to the DNA sequence of the invention.

In addition, this invention provides a hybrid duplex molecule
consisting essentially of the DNA sequence of the invention
hydrogen bonded to a nucleotide sequence of complementary base
sequence, such as DNA or RNA.

Further, this invention provides a polypeptide comprising an
amino acid sequence of hap protein, wherein the polypeptide
contains the amino acid sequence shown in Figure 2. More
particularly, the amino acid sequence comprises:

MetPheAspCysMetAspValLeuSerValSerProGlyGlnIleLeuAspPheTyrThrAla
SerProSerSerCysMetLeuGlnGluLysAlaLeuLysAlaCysPheSerGlyLeuThrGln
ThrGluTrpGlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu
LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys
GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe
ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn
LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer
LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr
GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln
GluThrPheProSerLeuCysGlnLeuGlyLysTyrThrThrAsnSerSerAlaAspHisArg
ValArgLeuAspLeuGlyLeuTrpAspLysPheSerGluLeuAlaThrLysCysIleIleLys
IleValGluPheAlaLysArgLeuProGlyPheThrGlyLeuThrIleAlaAspGlnIleThr
LeuLeuLysAlaAlaCysLeuAspIleLeuIleLeuArgIleCysThrArgTyrThrProGlu
GlnAspThrMetThrPheSerAspGlyLeuThrLeuAsnArgThrGlnMetHisAsnAlaGly
PheGlyProLeuThrAspLeuValPheThrPheAlaAsnGlnLeuLeuProLeuGluMetAsp
AspThrGluThrGlyLeuLeuSerAlaIleCysLeuIleCysGlyAspArgGlnAspLeuGlu
GluProThrLysValAspLysLeuGlnGluProLeuLeuGluAlaLeuLysIleTyrIleArg
LysArgArgProSerLysProHisMetPheProLysIleLeuMetLysIleThrAspLeuArg
SerIleSerAlaLysGlyAlaGluArgValIleThrLeuLysMetGluIleProGlySerMet
ProProLeuIleGlnGluMetMetGluAsnSerGluGlyHisGluProLeuThrProSerSer
SerGlyAsnThrAlaGluHisSerProSerIleSerProSerSerValGluAsnSerGlyVal
SerGlnSerProLeuValGln.

The invention also covers serotypic variants of the polypeptide 
and fragments of the polypeptide. The polypeptide is free from 
human serum proteins, virus, viral proteins, human tissue, and 
human tissue components. Preferably, the polypeptide is free 
from human, blood-derived protein. 

The hap protein (hap for hepatoma) exhibits strong homology
with the human retinoic acid receptor (RAR) (de The, H., Marchio,
A., Tiollais, P. & Dejean, A. Nature 330, 667-670 (1987);
Petkovich, M., Brand, N.J., Krust, A. & Chambon, P. Nature 330,
444-450 (1987)), a receptor that has been recently characterized
(Petkovich, M., Brand, N.J., Krust, A. & Chambon, P. Nature 330,
444-450 (1987); Giguere, V., Ong, E.S., Segui, P. & Evans, R.M.
Nature 330, 624-629 (1987)). To test the possibility that the hap
protein might also be a retinoid receptor, a chimaeric receptor
was created by replacing the putative DNA-binding domain of hap
with that of the human oestrogen receptor (ER). The resulting
hap-ER chimaera was then tested for its ability to trans-activate
an oestrogen-responsive reporter gene (vit-tk-CAT) in the
presence of possible receptor ligands. It was discovered that
retinoic acid (RA) at physiological concentrations is effective






WO 89/05854 



PCT/EP88/01180 




in inducing the expression of this reporter gene by the hap-ER
chimaeric receptor. See Nature, 332:850-853 (1988). This
demonstrates the existence of two human retinoic acid receptors,
designated RAR-α and RAR-β.

More particularly, it has been discovered that the hap protein
is a second retinoic acid receptor. Thus, the expression
"hap protein" is used interchangeably herein with the abbreviation
"RAR-β" for the second human retinoic acid receptor.

Also, this invention provides a process for selecting a
nucleotide sequence coding for hap protein or a portion thereof
from a group of nucleotide sequences comprising the step of
determining which of the nucleotide sequences hybridizes to a DNA
sequence of the invention. The nucleotide sequence can be a DNA
sequence or an RNA sequence. The process can include the step of
detecting a label on the nucleotide sequence.

Still further, this invention provides a recombinant vector 
comprising lambda-NM1149 having an EcoRI restriction endonuclease 
site into which has been inserted the DNA sequence of the inven- 
tion. The invention also provides plasmid pCOD20, which com- 
prises the DNA sequence of the invention. 

This invention provides an E. coli bacterial culture in a
purified form, wherein the culture comprises E. coli cells
containing DNA, wherein a portion of the DNA comprises the DNA
sequence of the invention. Preferably, the E. coli is strain TG-1.

In addition, this invention provides a method of using the
purified retinoic acid receptor of the invention for assaying a
medium, such as a fluid, for the presence of an agonist or
antagonist of the receptor. In general, the method comprises
providing a known concentration of a proteinaceous receptor of the
invention, incubating the receptor with a ligand of the receptor
and a suspected agonist or antagonist under conditions sufficient
to form a ligand-receptor complex, and assaying for
ligand-receptor complex or for free ligand or for non-complexed
receptor. The assay can be conveniently carried out using
labelled reagents as more fully described hereinafter, and
conventional techniques based on nucleic acid hybridization,
immunochemistry, and chromatography, such as TLC, HPLC, and
affinity chromatography.

In another method of the invention, a medium is assayed for
stimulation of transcription of the RAR-β gene or translation of
the gene by an agonist or antagonist. For example, β-receptor
binding retinoids can be screened in this manner.

BRIEF DESCRIPTION OF THE DRAWINGS 
This invention will be described in greater detail with ref- 
erence to the drawings in which 

Fig. 1 is a restriction map of human liver hap cDNA;

Fig. 2 (2a, 2b, 2c, 2d) is the nucleotide sequence of human liver hap cDNA
and a predicted amino acid sequence of human liver hap cDNA;

Fig. 3 depicts the distribution of hap mRNA in different
tissues as determined by Northern blot analysis;

Fig. 4 depicts the distribution of hap mRNA in HCC and HCC
derived cell lines as determined by Northern blot analysis;

Fig. 5 is a fluorograph of hap polypeptide synthesized in
vitro and isolated on SDS-polyacrylamide gel;

Fig. 6 shows the alignment of hap translated amino acid
sequence with several known sequences for thyroid and steroid
hormone receptors;

Fig. 7 is a schematic alignment of similar regions
identified as A/B, C, D, and E of the amino acid sequences of Fig. 6;

Fig. 8 depicts hap related genes in vertebrates (A) and in
humans (B and C) as determined by Southern blot analysis;

Fig. 9 shows the tissue distribution of RAR α and β
transcripts;

Fig. 10 shows the dose- and time-response of RAR α and β
transcripts after retinoic acid treatment of PLC/PRF/5 cells;

Fig. 11 shows the effect of RNA and protein synthesis
inhibitors on the levels of RAR α and β mRNAs;

Fig. 12 reports the results of nuclear run-on analysis of
RAR β gene transcription after RA treatment;

Fig. 13 reports the results of nuclear run-on analysis of
RAR β transcription in two hepatoma cell lines;

Fig. 14 shows the resulting kinetic analysis of RAR mRNA
degradation;

Fig. 15 depicts a nucleotide sequence analysis extending
lambda-13 RAR-β by 72 bp; and

Fig. 16 is a complete restriction map of a cloned
HindIII-BamHI genomic DNA insert containing the nucleotide
sequence of Fig. 15.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

A. IDENTIFICATION OF A PROTEIN, NAMED hap PROTEIN, HAVING
DNA-BINDING AND LIGAND-BINDING DOMAINS, AND
IDENTIFICATION OF THE DNA SEQUENCE ENCODING hap PROTEIN

As previously noted, ligand-dependent transcriptional 
activators, such as steroid or thyroid hormone receptors, have 
recently been cloned. The primary structure and expression of a 
new gene, hap, closely related to steroid or thyroid hormone 
receptor genes have now been discovered. The hap product
exhibits two regions highly homologous to the conserved DNA- and 
hormone-binding domains of previously cloned receptors. 

More particularly, the cloning of a cDNA corresponding to a 
novel steroid/thyroid hormone receptor-related gene has been 
achieved. The cDNA was recovered from a human liver cDNA library 
using a labelled cellular DNA fragment previously isolated from a 
liver tumor. The fragment contained a 147 bp putative exon in 
which HBV inserted. The sequence of this cellular gene, which is
referred to herein as hap for hepatoma, reveals various structural
features characteristic of c-erbA/steroid receptors (Dejean et
al., 1986). The receptor-related protein is likely to be a novel
member of the superfamily of transcriptional regulatory proteins 
that includes the thyroid and steroid hormone receptors. 

It has been discovered that the hap gene is transcribed at
low level in most human tissues, but the gene is overexpressed in
prostate and kidney. Moreover, six out of seven hepatoma and
hepatoma-derived cell lines express a small hap transcript, which
is undetectable in normal adult and fetal livers, but is present
in all non-hepatic tissues tested. Altered expression of hap may
be involved in liver oncogenesis.

These findings, as well as other discoveries relating to
this invention, will now be described in detail.

A. 1. Cloning and Sequencing of a hap cDNA

A human liver cDNA library was screened using a nick- 
translated 350 bp EcoRI genomic fragment (MKT probe) previously 
cloned from a hepatoma sample. The fragment contained the putative
147 bp cellular exon in which HBV integration took place
(Dejean et al., 1986).

Four positive 3' co-terminal clones were isolated from the 2 
x 10^ plaques screened and the restriction maps were deduced for 
each of the cDNA clone EcoRI inserts. The longest one was
identified as lambda-13. The restriction map of lambda-13 is shown in
Fig. 1.

Referring to Fig. 1, the insert of clone lambda-13 is nearly
a full-length cDNA for the hap gene. Noncoding sequences (lines)
and coding sequences (boxed portion) are indicated. Restriction
sites are:

R EcoRI
Bg BglII
M MaeI
X XhoI
K KpnI
P PvuII
B BamHI
H HindIII.

The lambda-13 clone was subjected to nucleotide sequence 
analysis. The nucleotide sequence is shown in Fig. 2. The
nucleotide sequence of the hap cDNA is presented in the 5' to 3'
orientation. The numbers on the right refer to the position of
the nucleotides. Numbers above the deduced translated sequence
indicate amino acid residues. The four short open reading frames
in the 5' untranslated region are underlined. Adenosine residues
(20) are found at the 3' end of lambda-13. The putative
polyadenylation signal site (AATAAA) is boxed. The region homologous
to the DNA-binding domain of known thyroid/steroid hormone
receptors is indicated by horizontal arrows. The exon, previously
cloned from a HCC sample genomic DNA library and in which HBV
integration took place, is bracketed.

This invention of course includes variants of the nucleotide
sequence shown in Fig. 2 encoding hap protein or a serotypic
variant of hap protein exhibiting the same immunological
reactivity as hap protein.

The DNA sequence of the invention is in a purified form.
Generally, the DNA sequence is free of human serum proteins,
viral proteins, and nucleotide sequences encoding these proteins.
The DNA sequence of the invention can also be free of human
tissue.
The DNA sequence of the invention can be used as a probe for
the detection of a nucleotide sequence in a biological material,
such as tissue or body fluids. The polynucleotide probe can be
labeled with an atom or inorganic radical, most commonly using a
radionuclide, but also perhaps with a heavy metal.

In some situations it is feasible to employ an antibody
which will bind specifically to the probe hybridized to a
single-stranded DNA or RNA. In this instance, the antibody can be
labeled to allow for detection. The same types of labels which are
used for the probe can also be bound to the antibody in
accordance with known techniques.

Conveniently, a radioactive label can be employed.
Radioactive labels include 32P, 3H, 14C, or the like. Any radioactive
label can be employed which provides for an adequate signal and
has sufficient half-life. Other labels include ligands, which can
serve as a specific binding member to a labeled antibody,
fluorescers, chemiluminescers, enzymes, antibodies which can
serve as a specific binding pair member for a labeled ligand, and
the like. The choice of the label will be governed by the effect
of the label on the rate of hybridization and binding of the
probe to the DNA or RNA. It will be necessary that the label
provide sufficient sensitivity to detect the amount of DNA or RNA
available for hybridization.
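
The requirement that a radioactive label have "sufficient half-life" can be made concrete with the ordinary exponential decay law. The following is an illustrative sketch only: the half-lives are approximate textbook constants (about 14.3 days for 32P, 12.3 years for 3H, 5,730 years for 14C) and are not taken from this specification, and the function name is hypothetical.

```python
# Approximate half-lives in days. These are standard physical constants
# supplied as an assumption; they do not come from the specification.
HALF_LIFE_DAYS = {
    "32P": 14.3,
    "3H": 12.3 * 365.25,
    "14C": 5730 * 365.25,
}


def fraction_remaining(isotope: str, days: float) -> float:
    """Fraction of the initial activity left after `days` (0.5 ** (t / t_half))."""
    return 0.5 ** (days / HALF_LIFE_DAYS[isotope])


if __name__ == "__main__":
    # Compare how much signal each label retains four weeks after labeling.
    for isotope in HALF_LIFE_DAYS:
        print(isotope, f"{fraction_remaining(isotope, 28):.3f}")
```

On these figures a 32P-labeled probe loses roughly three quarters of its activity in four weeks, whereas 3H and 14C labels are essentially unchanged over the same period, which is one practical reading of the half-life criterion stated above.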

Ligands and anti-ligands can be varied widely. Where a
ligand has a natural receptor, namely ligands such as biotin,
thyroxine, and cortisol, these ligands can be used in conjunction
with labeled naturally occurring receptors. Alternatively, any
compound can be used, either haptenic or antigenic, in
combination with an antibody.

Enzymes of interest as labels are hydrolases, particularly
esterases and glycosidases, or oxidoreductases, particularly
peroxidases. Fluorescent compounds include fluorescein and its
derivatives, rhodamine and its derivatives, dansyl,
umbelliferone, etc. Chemiluminescers include luciferin and
luminol.

A. 2. Amino Acid Sequence of Protein Encoded by hap Gene 

Based upon the sequence of the hap cDNA, the amino acid
sequence of the protein encoded by hap gene was determined. With
reference to Fig. 2, the deduced amino acid sequence encoded by 
the gene reveals a long open reading frame of 448 amino acids 
corresponding to a predicted polypeptide of relative molecular 
mass 51,000. 

A putative initiator methionine codon and an in-frame 
terminator codon are positioned respectively at nucleotides 322 
and 1666 in the sequence (Fig. 2). However, two other methionine 
codons are found 4 and 26 triplets downstream from the first ATG 
making the determination of the initiation site equivocal. 
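
The positions quoted above are mutually consistent, as a short calculation shows. This sketch assumes that 322 is the first base of the initiator ATG and 1666 the first base of the terminator codon, and it uses a rule-of-thumb average residue mass of about 110 Da; both are assumptions made for illustration, and the 51,000 figure in the text presumably reflects the actual amino acid composition.

```python
START = 322            # first base of the putative initiator ATG (from the text)
STOP = 1666            # first base of the in-frame terminator codon (from the text)
AVG_RESIDUE_DA = 110   # rule-of-thumb average residue mass; an assumption


def orf_codons(start: int, stop: int) -> int:
    """Number of sense codons between the start codon and the stop codon."""
    span = stop - start
    if span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3


codons = orf_codons(START, STOP)          # length of the long open reading frame
approx_mass = codons * AVG_RESIDUE_DA     # crude mass estimate in daltons

# Lengths if translation instead began at the methionine codons found
# 4 and 26 triplets downstream of the first ATG.
alternative_lengths = [codons - 4, codons - 26]

if __name__ == "__main__":
    print(codons, approx_mass, alternative_lengths)
```

The span of 1,344 nucleotides gives 448 codons, matching the stated reading frame, and the crude mass estimate lands within a few percent of the predicted relative molecular mass.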

The coding sequence is preceded by a 5' region of at least
321 nucleotides which contains four short open reading frames
delineated by initiator and stop codons (Fig. 2). Translation
usually starts, in eukaryotes, at the 5'-most ATG triplet, but
the finding of open reading frames in the 5' "untranslated"
region is not unprecedented (Kozak, 1986). It is not known yet
whether those sequences are used for translation and exert any
function in the cell.

In the 3' untranslated region, 1326 nucleotides long, no
long open reading frame is present. A putative polyadenylation
signal (AATAAA) is found 19 bp upstream from the polyadenylation
site.

It will be understood that the present invention is intended
to encompass the protein encoded by the hap gene, i.e. hap
protein, and fragments thereof in highly purified form. The hap
protein can be expressed in a suitable host containing the DNA
sequence of the invention. This invention also includes
polypeptides in which all or a portion of the binding site of hap
protein is linked to a larger carrier molecule, such as a
polypeptide or a protein, and in which the resulting product exhibits
specific binding in vivo and in vitro. In this case, the
polypeptide can be smaller or larger than the proteinaceous binding
site of the protein of the invention.

It will be understood that the polypeptide of the invention
encompasses molecules having equivalent peptide sequences. By
this it is meant that peptide sequences need not be identical.
Variations can be attributable to local mutations involving one
or more amino acids not substantially affecting the binding
capacity of the polypeptide. Variations can also be attributable
to structural modifications that do not substantially affect
binding capacity. Thus, for example, this invention is intended
to cover serotypic variants of hap protein.
Three particular regions of hap gene are of interest. Two
of them are located in the D region (amino acids comprised
between 46 and 196), which have been shown by the inventors to be
highly immunogenic. Amino acids 46-196 have the sequence:

GlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu
LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys
GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe
ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn
LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer
LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr
GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln
GluThrPheProSerLeuCys.

One peptide of interest in the D region is comprised of amino
acids 151-167 and has the sequence:

ValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCys.

A second peptide in the D region is located between amino
acids 175 and 185. This peptide has the amino acid sequence:

AlaGluLeuAspAspLeuThrGluLysIleArg.

Another peptide of interest is located at the end of C
region between amino acids 440 and 448. This peptide has the amino
acid sequence:

GlyValSerGlnSerProLeuValGln.

Other peptides having formulas derived from the nucleotide
sequence of hap gene can be used as reagents, particularly to
obtain antibodies for diagnostic purposes, as defined
hereinabove.

The most favorable region is found in the hinge region
(amino acids 147 to 193). This region includes amino acids 150
to 170, corresponding to the following criteria:

The region includes very hydrophilic sequences, namely,
the sequences 154-160 (No. 1/Hopp); 155-161 (No. 1/Doolittle);
155-159 (No. 1/acrophilic).

The region includes a peptide, namely, amino acids
156-162, No. 5 in mobility.

The polypeptide of this region has a low probability of
adopting a structure in the form of a folded sheet or a helix,
but, in contrast, a good probability of an omega loop and one
beta-turn, very marked in the Asp-Arg-Asn-Lys tetrapeptide.

The region does not have a potential site of
N-glycosylation nearby; several suggestions in this zone can be
made:

Val-Arg-Asn-Asp-Arg-Asn-Lys-Lys-Lys-Lys-Glu-Thr-Ser-Lys-
Gln-Glu-Cys (peptide 1).
Peptide 1 corresponds to amino acids 151-167 and permits finding
Cys 167, which is present in the sequence and enables attachment
to a carrier (it will be noted that this peptide corresponds to a
consensus sequence of phosphorylation by kinase A).

Peptide 1 can be shortened at the N-terminus while preserving the
beta-turn and at the C-terminus while replacing Ser by Cys to maintain
the possibility of coupling at this level:
Asn-Asp-Arg-Asn-Lys-Lys-Lys-Lys-Glu-Thr-Cys (peptide 2).
Peptide 2 is also favorable, but is clearly less favorable than
Peptide 1 from the viewpoint of hydrophilicity and because of its higher
potential for spatial organization (probably as amphiphilic
helix).
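
The hydrophilicity ranking cited for the hinge region can be illustrated with a sliding-window average. The sketch below is an assumption-laden reconstruction, not the inventors' procedure: it uses the commonly tabulated Hopp-Woods scale (values reproduced here as an assumption and stored in tenths so that sums stay exact), a 7-residue window chosen because the ranges quoted in the text span seven residues, and only residues 148-168 of the sequence. The names `HOPP_WOODS_X10`, `SEGMENT`, and `best_window` are hypothetical.

```python
# Hopp-Woods hydrophilicity values multiplied by 10 (integers avoid
# floating-point ties). Table supplied as an assumption; verify against a
# primary reference before relying on it.
HOPP_WOODS_X10 = {
    "R": 30, "D": 30, "E": 30, "K": 30, "S": 3, "N": 2, "Q": 2, "G": 0,
    "P": 0, "T": -4, "H": -5, "A": -5, "C": -10, "M": -13, "V": -15,
    "I": -18, "L": -18, "Y": -23, "F": -25, "W": -34,
}

# Residues 148-168 of the hap protein in one-letter code
# (LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr).
SEGMENT = "KESVRNDRNKKKKETSKQECT"
SEGMENT_START = 148


def best_window(seq: str, first_pos: int, width: int = 7):
    """Return (start, end, mean score) of the most hydrophilic window.

    Ties are resolved in favor of the window closest to the N-terminus.
    """
    best_total, best_index = None, None
    for i in range(len(seq) - width + 1):
        total = sum(HOPP_WOODS_X10[aa] for aa in seq[i:i + width])
        if best_total is None or total > best_total:
            best_total, best_index = total, i
    start = first_pos + best_index
    return start, start + width - 1, best_total / (10 * width)


if __name__ == "__main__":
    print(best_window(SEGMENT, SEGMENT_START))
```

With this table the top-scoring window in the stretch is 154-160, with 155-161 scoring identically, which agrees with the ranges quoted above; a different scale or window length could shift the result.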

Finally, it will be noted that the C-terminal end consti- 
tutes a preferred region as a function of its mobility, but it 
nevertheless remains very hydrophobic. For example, the follow- 
ing peptide is contemplated: 

Cys-Gly-Val-Ser-Gln-Ser-Pro-Leu-Val-Gln (peptide 3). 
Peptide 3 can be fixed in a specific manner by an N-terminal Cys 
in such a way as to reproduce its aspect on the protein. 

The nucleotide sequences of hap gene encoding those peptides
are as follows:

For peptide 1:

GTCAGGAATGACAGGAACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGC.
For peptide 2:

AATGACAGGAACAAGAAAAAGAAGGAGACT.
For peptide 3:

GGGGTCAGTCAGTCACCACTCGTGCAA.
For peptide of amino acids 175-185:
GCTGAGTTGGACGATCTCACAGAGAAGATCCGA.
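
The correspondence between a coding sequence and its peptide can be checked by translating the nucleotides with the standard genetic code. The following is a minimal sketch; the table construction and the `translate` helper are illustrative and not part of the specification, and the example uses the peptide 1 coding sequence as printed above.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


PEPTIDE_1_DNA = "GTCAGGAATGACAGGAACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGC"

if __name__ == "__main__":
    # One-letter form of ValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCys.
    print(translate(PEPTIDE_1_DNA))
```

The same helper can be applied to the other coding sequences listed above to confirm which peptide each one encodes.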

The polypeptides of the invention can be injected in mice,
and monoclonal and polyclonal antibodies can be obtained.
Classical methods can be used for the preparation of hybridomas.
The antibodies can be used to quantify the amount of human
receptors produced by patients in order to correlate the
pathological states of illness and quantity of receptors or the
absence of such receptors.

Epitope-bearing polypeptides, particularly those whose
N-terminal and C-terminal amino acids are free, are accessible by
chemical synthesis using techniques well known in the chemistry
of proteins. For example, the synthesis of peptides in
homogeneous solution and in solid phase is well known.

In this respect, recourse may be had to the solid phase
synthesis of peptides using the method of Merrifield, J. Am. Chem.
Soc. 85, 2149-2154 (1963) or the method of synthesis in
homogeneous solution described by Houben-Weyl in the work entitled
"Methoden der Organischen Chemie" (Methods of Organic Chemistry),
edited by E. WUNSCH, vol. 15-I and II, THIEME, Stuttgart (1974).

This method of synthesis consists of successively condensing
either the successive amino acids in pairs in the appropriate
order, or successive peptide fragments previously available or
formed and already containing several aminoacyl residues in the
appropriate order. Except for the carboxyl and
amino groups which will be engaged in the formation of the
peptide bonds, care must be taken to protect beforehand all other
reactive groups borne by these aminoacyl groups and fragments.
However, prior to the formation of the peptide bonds, the
carboxyl groups are advantageously activated according to methods
well known in the synthesis of peptides. Alternatively, recourse
may be had to coupling reactions bringing into play conventional






coupling reagents, for instance of the carbodiimide type such as
1-ethyl-3-(3-dimethyl-aminopropyl)-carbodiimide. When the amino
acid group carries an additional amino group (e.g. lysine) or
another acid function (e.g. glutamic acid), these groups may be
protected by carbobenzoxy or t-butyloxycarbonyl groups, as
regards the amino groups, or by t-butylester groups, as regards
the carboxylic groups. Similar procedures are available for the
protection of other reactive groups. For example, an SH group (e.g.
in cysteine) can be protected by an acetamidomethyl or
paramethoxybenzyl group.

In the case of progressive synthesis, amino acid by amino
acid, the synthesis preferably starts by the condensation of the
C-terminal amino acid with the amino acid which corresponds to
the neighboring aminoacyl group in the desired sequence and so
on, step by step, up to the N-terminal amino acid. Another
preferred technique that can be relied upon is that described by
R.D. Merrifield in "Solid Phase Peptide Synthesis" (J. Am. Chem.
Soc. 85, 2149-2154). In accordance with the Merrifield process,
the first C-terminal amino acid of the chain is fixed to a
suitable porous polymeric resin by means of its carboxylic group,
the amino group of said amino acid then being protected, for
example, by a t-butyloxycarbonyl group.

When the first C-terminal amino acid is thus fixed to the
resin, the protective group of the amino group is removed by
washing the resin with an acid, i.e. trifluoroacetic acid when
the protective group of the amino group is a t-butyloxycarbonyl
group.

Then the carboxylic group of the second amino acid, which is
to provide the second aminoacyl group of the desired peptide
sequence, is coupled to the deprotected amino group of the
C-terminal amino acid fixed to the resin. Preferably, the
carboxyl group of this second amino acid has been activated, for
example by dicyclohexylcarbodiimide, while its amino group has
been protected, for example by a t-butyloxycarbonyl group. The
first part of the desired peptide chain, which comprises the
first two amino acids, is thus obtained. As previously, the
amino group is then deprotected, and one can further proceed with
the fixing of the next aminoacyl group and so forth until the
whole peptide sought is obtained.
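
The cycle just described (fix the C-terminal residue to the resin,
deprotect, couple the next protected residue, repeat toward the
N-terminus) can be sketched as a short planning routine. This is
only an illustrative sketch: the function name, the one-letter
sequence input and the wording of each step are assumptions made
for the example and are not part of the described method.

```python
from typing import List


def merrifield_cycles(sequence: str) -> List[str]:
    """List the solid-phase steps for a peptide written N-terminus to C-terminus.

    Hypothetical helper: `sequence` uses one-letter amino acid codes and the
    step wording is illustrative only.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one residue")

    residues = list(sequence)
    # The C-terminal residue is anchored first, with its amino group protected.
    steps = [f"attach Boc-{residues[-1]} to resin via its carboxyl group"]

    # Remaining residues are added one at a time, moving toward the N-terminus.
    for residue in reversed(residues[:-1]):
        steps.append("deprotect amino group (trifluoroacetic acid)")
        steps.append(f"couple activated Boc-{residue}")

    steps.append("remove side-chain protecting groups")
    steps.append("cleave peptide from resin (hydrofluoric acid)")
    return steps


if __name__ == "__main__":
    for step in merrifield_cycles("ACD"):
        print(step)
```

For a hypothetical tripeptide "ACD", the plan anchors D, then
couples C and finally A, which mirrors the C-terminal to
N-terminal direction of the stepwise synthesis.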

The protective groups of the different side groups, if any,
of the peptide chain so formed can then be removed. The peptide
sought can then be detached from the resin, for example, by means
of hydrofluoric acid, and finally recovered in pure form from the
acid solution according to conventional procedures.

Depending on the use to be made of the proteins of the
invention, it may be desirable to label the proteins. Examples
of suitable labels are radioactive labels, enzymatic labels,
fluorescent labels, chemiluminescent labels, or chromophores.
The methods for labeling proteins of the invention do not differ
in essence from those widely used for labeling immunoglobulins.

A.3. Tissue Specific mRNA Distribution

In order to study expression of the hap gene, Northern blot
analysis was performed using MNT as a probe and poly(A)+ RNA
extracted from various human tissues and cell lines. The results
are shown in Figure 3.

More particularly, Northern blot analyses were performed
with poly(A)+ RNAs (15 ug per lane) extracted from different
human organs and cell lines. A control hybridization with a mouse
beta-actin cDNA probe is shown below the hybridizations in Fig.
3. Hap mRNA in different tissues is shown in Fig. 3 as follows:

Lane a  ovary
Lane b  uterus
Lane c  HBL 100 mammary cells
Lane d  adult spleen
Lane e  18 weeks fetal spleen
Lane f  K562 (hematopoietic cell line)
Lane g  HL60 (hematopoietic cell line)
Lane h  prostatic adenoma
Lane i  kidney
Lane j  adult liver
Lane k  18 weeks fetal liver.

Lanes a-k correspond to a one day exposure.

Fig. 3 shows that two RNA species of 3 kb and approximately
2.5 kb (the size of this smaller mRNA is slightly variable from
one organ to another) were expressed at low abundance in ovary
(lane a), uterus (lane b), HBL 100 mammary cells (lane c), adult
and fetal spleen (lanes d and e, respectively), and K562 and HL60
hematopoietic cell lines (lanes f and g, respectively).
Surprisingly, an approximately tenfold higher level of expression
was detected in prostatic adenoma (lane h) and kidney (lane i).
By contrast, a single mRNA of 3000 nucleotides, expressed at low
levels, was present in poly(A)+ RNA from adult and fetal liver
tissues (lanes j and k). Therefore, the cloned hap cDNA is
likely to be a full-length copy of this transcript.

The finding of two mRNA species overexpressed in prostate
and kidney, as well as the presence of a single mRNA expressed at
a low level in adult and fetal livers, shows that hap expression
is differentially regulated in those organs. This tissue-specific
expression provides some indication that prostate and kidney, as
well as liver, could be key tissues and that hap functions in
those cell types may differ.

Fig. 4 shows hap mRNA in HCC and HCC-derived cell lines as
follows:

Lane a: normal liver (four days autoradiography);

Lanes b, c, d: three HCC samples (Lane b, patient Ca; Lane
c, patient Mo; Lane d, patient TCI);

Lanes e, f, g: three HCC-derived cell lines (Lane e,
PLC/PRF/5; Lane f, HEPG2; Lane g, HEP 3B).

Lanes b-g correspond to a one day exposure. Once again, a
control hybridization with a normal beta-actin cDNA probe is
shown below the hybridizations.

With reference to Fig. 4, the smaller 2.5 kb mRNA was
undetectable, even after long exposure, in three adult and two
fetal human livers analyzed (Fig. 4, Lane a). This differential
expression in normal livers may suggest a distinct role of hap in
this particular tissue.

Northern blot analysis of human HCCs and hepatoma cell lines
showed almost constant alterations in hap transcription. There
are two possible explanations for this result. The smaller mRNA
species may simply be expressed as a consequence of cellular
dedifferentiation. The tumorous liver cell, having lost its
differentiated characteristics, would behave as any other cell
type and thus express the same 2.5 kb mRNA as found in
non-hepatic cells. However, the inability to detect such a
smaller transcript in fetal livers does not seem to favor this
hypothesis. Alternatively, the presence of the smaller
transcript may have preceded the tumorigenesis events and would
rather reflect a preneoplastic state. The presence of an
inappropriately expressed hap protein, normally absent from
normal hepatocytes, may have directly participated in the
hepatocellular transformation. In this respect, the previous
study reporting an HBV integration in the hap gene of a human HCC
(Dejean et al., 1986) strongly supports the idea that hap could
be causatively involved in liver oncogenesis. Indeed, in this
tumor, a chimeric gene between the viral pre-S1 gene and hap may
have resulted in the over-expression of a truncated hap protein.
At present, it is not known whether the smaller transcript seen
in tumors is the one found in non-hepatic tissues.

A.4. Expression of hap in Hepatocellular Carcinoma

Hap was first identified in a human primary liver cancer.
Encouraged by this finding, poly(A)+ RNAs from seven hepatomas
and hepatoma-derived cell lines were analyzed by Northern
blotting. Five of them contained integrated HBV DNA sequences.
In addition to the 3 kb long mRNA found in normal adult and fetal
liver, an additional RNA species of approximately 2.5 kb was
observed, in equal or even greater amount, in three out of four
HCCs (Fig. 4, Lanes b, c, d) and in the PLC/PRF/5, HEPG2 and
HEP3B hepatoma cell lines (Lanes e, f, g). The size of the
smaller transcript was variable from sample to sample. In
addition, the two transcripts were strikingly overexpressed, at
least ten fold, in the PLC/PRF/5 cells.

To test the possibility that the inappropriate expression of
hap in those six tumors and tumorous cell lines might be the
consequence of a genomic DNA alteration, Southern blotting of
cellular DNA was performed using two probes: the MNT fragment
and a 1 kb EcoRI fragment corresponding to the 5' extremity of
the cDNA insert (Fig. 2). No rearrangement and/or amplification
was detected with either of these two probes, each of which
detects a different single exon (data not shown), suggesting that
the hap gene was not altered at the genomic level. It is not yet
known whether the approximately 2.5 kb mRNA, present in the liver
tumorous samples and cell lines, corresponds to the same smaller
transcript as that found in non-hepatic tissues. However, its
presence in the liver seems to be clearly associated with the
hepatocellular transformed state.

A.5. Hormone-binding Assay

Amino-acid homologies between the hap protein and the
c-erbA/steroid receptors support the hypothesis that hap may be a
receptor for a thyroid/steroid hormone-related ligand. The
ability to express functional receptors in vitro from cloned
c-erbA/steroid receptor genes led to the use of an in vitro
translation assay to identify a putative hap ligand.

The coding region of hap was cloned into the pTZ18 plasmid
vector to allow in vitro transcription with the T7 RNA polymerase
and subsequent translation in reticulocyte lysates. The results
are shown in Fig. 5. More particularly, 35S-methionine-labelled
products synthesized using T7 polymerase-catalysed RNA
transcripts were separated on a 12% SDS-polyacrylamide gel, which
was fluorographed (DMSO-PPO). The lanes in Fig. 5 are as follows:

Lane a, pCOD 20 (sense RNA, 70 ng)
Lane b, pCOD 20 (140 ng)
Lane c, pCOD 14 (antisense RNA, 140 ng).

Figure 5 shows that the hap RNA directed the efficient
synthesis of a major protein, with a 51 K relative molecular
mass, consistent with the size predicted by the amino acid
sequence (lanes a and b), whereas the anti-sense RNA-programmed
lysate gave negligible incorporation (lane c).

Because c-erbA and hap colocalize on chromosome 3 and are
more closely related according to their amino acid sequence,
(125I)-T3 (triiodothyronine), -reverse T3 (3,3',5'-triiodothyronine)
and -T4 (thyroxine) were first tested for their binding with the
in vitro translated hap polypeptide. No specific fixation with
any of those three thyroid hormones could be detected. As a
positive control, binding of T3 was detected with nuclear
extracts from HeLa cells. The results were negative as well when
the experiment was repeated with (3H)-retinol, -retinoic acid,
and -testosterone, which represent three putative ligands for hap
whose receptors have not yet been cloned. Although it cannot be
excluded that hap may encode a hormone-independent
transcriptional activator, it is more likely that the hap
product, i.e. the hap protein, is a receptor for a presently
unidentified hormone.

A.6. Similarity of HAP Protein to Thyroid/Steroid Hormone Receptors

The c-erbA gene product, recently identified as a receptor
for thyroid hormone (Weinberger et al., 1986; Sap et al., 1986),
as well as the steroid receptors, belong to a superfamily of
regulatory proteins which, upon binding their specific ligand,
appear capable of activating the transcription of target genes
(reviewed by Yamamoto, 1985). This activation seems to be the
result of a specific binding of the hormone-receptor complex to
high-affinity sites on chromatin.

Comparative sequence analysis has been made between the
following different cloned steroid receptors:

glucocorticoid receptor (GR) (Hollenberg et al., 1985;
Miesfeld et al., 1986);

oestrogen receptor (ER) (Green et al., 1986; Greene et al.,
1986);

progesterone receptor (PR) (Conneely et al., 1986; Loosfelt
et al., 1986); and

thyroid hormone receptor (c-erbA product) (Weinberger et
al., 1986; Sap et al., 1986).

Mutation analysis has also been carried out (Kumar et al.,
1986; Hollenberg et al., 1987; Miesfeld et al., 1987). The
results revealed the presence of two conserved regions
representing the putative DNA-binding and hormone-binding domains
of those molecules. It has now been discovered that hap protein
is homologous to the thyroid/steroid hormone receptors.

More particularly, homology previously reported between the
putative 147 bp cellular exon (bracketed in Fig. 2) and the
c-erbA/steroid receptor genes led us to compare the entire hap
predicted amino acid sequence with hGR, rPR, hER, and
hc-erbA/thyroid hormone receptor. The five sequences have been
aligned for maximal homology by the introduction of gaps. The
results are depicted in Fig. 6. Specifically, the following
amino acid sequences were aligned after a computer alignment of
pairs (Wilbur and Lipman, 1983):

hap product,

human placenta c-erbA protein (hc-erbA, Weinberger et al.,
1986),

human oestrogen receptor (hER, Green et al., 1986),

rabbit progesterone receptor (rPR, Loosfelt et al., 1986),
and

human glucocorticoid receptor (hGR, Hollenberg et al.,
1985).

A minimal number of gaps (-) was introduced in the alignment.

Amino acid residues matched in at least three of the
polypeptides are boxed in Figure 6. The codes for amino acids are:

A    Ala    Alanine
C    Cys    Cysteine
D    Asp    Aspartic Acid
E    Glu    Glutamic Acid
F    Phe    Phenylalanine
G    Gly    Glycine
H    His    Histidine
I    Ile    Isoleucine
K    Lys    Lysine
L    Leu    Leucine
M    Met    Methionine
N    Asn    Asparagine
P    Pro    Proline
Q    Gln    Glutamine
R    Arg    Arginine
S    Ser    Serine
T    Thr    Threonine
V    Val    Valine
W    Trp    Tryptophan
Y    Tyr    Tyrosine.

The sequence comparison analysis revealed that the two
regions highly conserved in the thyroid/steroid hormone receptors
are similarly conserved in the hap product. Consequently, the
overall organization of hap is very similar to that of the four
receptors in that it can be roughly divided into four regions
(arbitrarily referred to as A/B, C, D and E (Krust et al.,
1986)).

In C, the most highly conserved region, extending from
amino acid 81 to 146 in hap, the nine cysteines already conserved
between the four known receptors are strikingly present at the
same positions. Comparison between the cysteine-rich region of
hap with the corresponding region of the four receptors reveals
64% amino acid identity with hc-erbA, 59% with hER, 42% with rPR
and 44% with hGR. This is schematically represented in Fig. 7.
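
As an illustration of how a percentage of amino acid identity of
this kind can be computed from two aligned sequences, a minimal
sketch follows. It rests on assumptions not stated in the text:
the function name is hypothetical, columns containing a gap are
excluded from the count, and the short sequences in the usage
lines are invented toy strings, not the hap or receptor sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the gap-free aligned columns.

    Assumption: columns in which either sequence has a gap are skipped, so
    the denominator is the number of residue-to-residue comparisons.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap column: not counted
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy strings for illustration only.
    print(percent_identity("CEGCKG", "CEGCRG"))
    print(percent_identity("AC-D", "ACED"))
```

Other conventions exist for the denominator (for example, the
length of the shorter sequence), so figures computed this way
need not match the published values exactly.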

Referring to Fig. 7, a schematic alignment of the five
proteins can be seen. The division of the thyroid/steroid hormone
receptor regions A/B, C, D, E is schematically represented in the
hap protein. The two highly conserved regions, identified as the
putative DNA-binding (region C) and hormone-binding (region E)
domains of the receptors, are shown as stippled blocks. The
numbers refer to the position of amino acid residues. The
sequences of each of the hc-erbA product, hER, rPR and hGR
receptors are compared with the hap protein. The numbers present
in the stippled blocks correspond to the percentage of homology
between hap protein on the one hand and each of the receptors on
the other hand in the two highly conserved regions C and E. The
empty blocks correspond to the non-conserved A/B and D regions.

It has also been found that hap shares 47% homology in the C
region with the chicken vitamin D3 receptor (VDR), recently
cloned as a partial cDNA (McDonnell et al., 1987) (data not
shown). Apart from c-erbA, which contains two additional
residues, the 66 amino acid long C region shows a constant length
in hER, VDR, hGR, rPR and hap sequences.

Region E (residues 195-448), which is well conserved, but to
a lesser extent, shows a slightly stronger homology to hc-erbA
(38%) (Fig. 7). The hap/hc-erbA homology, however, remains
inferior to the identity found between hGR and rPR (90 and 51 per
cent in regions C and E, respectively). No significant homology
was observed when comparing the A/B (residues 1-80) and D
(residues 147-194) regions, which are similarly variable, both in
sequence and length, in the four known receptors.
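
The residue boundaries quoted above for the four regions of the
hap protein can be collected in a small lookup, which also makes
it easy to check that region C spans 66 residues. The sketch below
is illustrative only; the dictionary and function names are
hypothetical, and only the residue ranges come from the text.

```python
from typing import Dict, Tuple

# Residue ranges (inclusive) of the hap protein regions as quoted in the text.
HAP_REGIONS: Dict[str, Tuple[int, int]] = {
    "A/B": (1, 80),
    "C": (81, 146),    # putative DNA-binding domain
    "D": (147, 194),
    "E": (195, 448),   # putative hormone-binding domain
}


def region_length(name: str) -> int:
    """Number of residues in the named region."""
    start, end = HAP_REGIONS[name]
    return end - start + 1


def region_of(position: int) -> str:
    """Name of the region containing a 1-based residue position."""
    for name, (start, end) in HAP_REGIONS.items():
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside residues 1-448")


if __name__ == "__main__":
    for name in HAP_REGIONS:
        print(name, region_length(name))
```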

It is thus evident from Figs. 6 and 7 that the hap product
exhibits two highly homologous regions. The C domain is
characterized by strikingly conserved Cys-X2-Cys units, evoking
those found in the DNA-binding transcriptional factor TFIIIA
(Miller et al., 1985) and in some proteins that regulate
development, such as Krüppel (Rosenberg et al., 1986). In the
latter, the Cys-X2-Cys, together with His-X3-His units, can form
metal binding fingers that are crucial for DNA-binding (Berg,
1986; Diakun et al., 1986). Similarly, the C domains of
previously cloned receptors are likely to contain metal binding
fingers and were shown to bind DNA (Hollenberg et al., 1987;
Miesfeld et al., 1987). Since the C region of the hap gene
product shares 24/66 conserved amino acids with all steroid or
thyroid hormone receptors, including all nine cysteine residues,
it is likely that the hap protein is a DNA-binding protein. Hap,
like the c-erbA/steroid receptors, may modulate the transcription
of target genes.

In addition, the significant homology detected in the E
domain suggests that the hap product is a ligand-binding protein
and raises the question of the nature of the putative ligand. Hap
protein seems to differ too much from previously cloned hormone
receptors to be a variant of one of them. In addition, the in
vitro translated 51 K hap polypeptide failed to bind all ligands
tested. Although the hap gene product could be a
ligand-independent DNA-binding protein, it is believed that hap
encodes a receptor for a presently unidentified circulating or
intracellular ligand.

It has been proposed that steroid and thyroid hormone
receptor genes were derived from a common ancestor (Green and
Chambon, 1986). This primordial gene may have provided the
receptors with their common scaffolding, while the hormone and
target gene cellular DNA specificities were acquired through
mutations accumulated in the C and E domains. Hap is linked both
to the steroid receptor genes by its shorter C domain (66 AA) and
to the thyroid hormone receptor genes by its clearly greater
homology with c-erbA in the E region (38%). This suggests that
the hap ligand may belong to a different hormone family.

Different functions have been assigned to the four regions
defined in the glucocorticoid and oestrogen receptors (Kumar et
al., 1986; Giguere et al., 1986; Miesfeld et al., 1987). By
analogy, the regions C and E may represent, respectively, the
putative DNA-binding and hormone-binding domains of the hap
protein. The precise functions of the A/B and D domains remain
unknown. The presence of the amino-terminal A/B region of the
human GR has been recently shown to be necessary for full
transcriptional activity (Hollenberg et al., 1987), whereas
results obtained with the rat GR indicated it was dispensable
(Miesfeld et al., 1987). From this alignment study it appears
that hap is distinct from, but closely related to, the
thyroid/steroid hormone receptor genes, suggesting that its
product may be a novel ligand-dependent, DNA-binding protein.

A.7. Hap Related Genes

Southern blotting was performed on restriction enzyme-digested
DNAs obtained from different organisms with the labelled
genomic MNT fragment containing the first exon of the
cysteine-rich region of hap. The results are shown in Fig. 8.
More particularly, hap related genes in vertebrates (A) and in
humans (B and C) were compared. Cellular DNA (20 ug) from various
sources was digested with BglII and subjected to Southern blot
analysis using the MNT probe under non-stringent hybridization
and washing conditions. The lanes in Fig. 8A are identified as
follows:

Lane a  human liver
Lane b  domestic dog liver
Lane c  woodchuck (Marmota monax)
Lane d  mouse liver (BALB/c strain)
Lane e  chicken erythrocytes
Lane f  cartilaginous fish (Torpedo).

As illustrated in Fig. 8A, BglII fragments that anneal
effectively with the MNT probe under non-stringent hybridization
and washing conditions are present in digests of DNA from several
mammals (mouse, woodchuck, dog) as well as from bird and fish.
If this blotting experiment is performed at high stringency, no
hybridization is observed with heterologous DNA (data not shown).
These data suggest that the hybridizing sequences represent
evolutionarily conserved homologs of hap.

The existence of multiple c-erbA and GR genes (Jansson et
al., 1983; Weinberger et al., 1986; Hollenberg et al., 1985)
encouraged a search for hap related genes in the human genome.
Thus, human liver DNA digested by PstI, BamHI, and EcoRI was
analyzed by Southern blot, using the MNT probe, under stringent
conditions. The results are shown in Fig. 8B. After digestion of
liver DNA by PstI (lane a), BamHI (lane b), or EcoRI (lane c), a
single band is observed with the MNT probe in high stringency
hybridization.

The same blot was hybridized with the MNT probe under
non-stringent hybridization and washing conditions. The results
are shown in Fig. 8C. When Southern blotting was performed under
relaxed hybridization conditions, additional bands were observed
in the products of each enzyme digestion (Fig. 8C, lanes a, b, c).
For example, seven faint hybridizing fragments of 1, 1.7, 2.4,
3.8, 5.5, 6 and 7.4 kb were observed in the BamHI digestion (lane
b). None of those bands cross-hybridized with a human c-erbA
probe (data not shown). A minimum of three faint bands in the
PstI lane suggests the existence of at least four related hap
genes in the human genome.

From a panel of somatic cell hybrids, hap was assigned to
chromosome 3 (Dejean et al., 1986). To find out whether the hap
related genes were all chromosomally linked or not, DNAs from
human liver, from the LA.56U and L.53K cell lines (two
mouse/human somatic cell hybrids containing, altogether, most
human chromosomes except chromosome 3 (Nguyen Van Cong et al.,
1986)), and from mouse lymphoid cells were BamHI digested,
transferred to nitrocellulose, and hybridized to the MNT probe
in low-stringency conditions. Of the seven faint bands present
in the human liver DNA track, at least two were conserved in the
LA.56U and/or L.53K cell line DNA digests (data not shown),
indicating that some of the hap genes do not localize on
chromosome 3. Altogether the results suggest that hap belongs to
a multigene family consisting of at least four members dispersed
in the human genome.

★ * * 

The experimental procedures used in carrying out this
invention will now be described in greater detail.

A.8. EXPERIMENTAL PROCEDURES

A.8.1. cDNA Cloning and Screening

Briefly, the cDNA was synthesized using oligo dT primed
poly(A)+ liver mRNA, using the method of Gubler and Hoffman (1983)
(C. de Taisne, unpublished data). cDNAs were size selected on a
sucrose gradient and the fraction corresponding to a mean size of
3 kb was treated with EcoRI methylase. After addition of EcoRI
linkers, the cDNA was digested by EcoRI and ligated to an EcoRI
restricted lambda-NM1149. After in vitro encapsidation, the
phages were amplified on C600 hfl and 2 x 10^6 recombinants were
plated at a density of 10,000 per dish. The dishes were
transferred to nylon filters and hybridized to the 350 bp
EcoRI-EcoRI genomic fragment (MNT) previously described (Dejean
et al., 1986). Four positive clones were isolated and the
restriction map of each insert was determined. The longest one,
clone lambda-13, was subjected to nucleotide sequence analysis.

A.8.2. Nucleotide Sequence

Clone lambda-13 DNA was sonicated, treated with the Klenow
fragment of DNA polymerase plus deoxyribonucleotides (2 h, 15°C)
and fractionated by agarose gel electrophoresis. Fragments of
400-700 bp were excised and electroeluted. DNA was
ethanol-precipitated, ligated to dephosphorylated SmaI cleaved
M13 mp8 replicative form DNA and transfected into Escherichia
coli strain TG-1 by the high-efficiency technique of Hanahan
(1983). Recombinant clones were detected by plaque hybridization
using any of the four EcoRI fragments of the cDNA insert as
probes (Fig. 1). Single-stranded templates were prepared from
plaques exhibiting positive hybridization signals and were
sequenced by the dideoxy chain termination procedure (Sanger et
al., 1977) using buffer gradient gels (Biggin et al., 1983).

A.8.3. Northern Blot

Cytoplasmic RNA was isolated from fresh tissue using
guanidine thiocyanate, and RNA from cell lines was extracted
using isotonic buffer and 0.5% SDS, 10 mM Na acetate pH 5.2. RNAs
were then treated with hot phenol. Poly(A)+ RNAs (15 ug) of the
different samples were separated on a 1% agarose gel containing
glyoxal, transferred to nylon filters and probed using the
nick-translated MNT fragment. The experimental procedure is
described in Maniatis et al. (1982).

A.8.4. Southern Blot

20 ug of genomic DNA was digested to completion,
fractionated on a 0.8% agarose gel and transferred to nylon
paper. Low stringency hybridization was performed as follows:
24 h prehybridization in 35% formamide, 5x Denhardt, 5x SSC, 300
ug/ml denatured salmon sperm DNA, at 40°C; 48 h hybridization
with 35% formamide, 5x Denhardt, 5x SSC, 10% Dextran sulfate,
2.10 s cpm/ml denatured 32P labelled DNA probe (specific activity
5 x 10^8 cpm/ug). Washes were made in 2x SSC, 0.1% SDS, 55°C for
15 min. High stringency hybridization conditions were the same,
except that 50% formamide was used with 24 h hybridization.
Washing was in 0.1x SSC, 0.1% SDS, 55°C for 30 min.
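
The two sets of conditions differ in only a few parameters, which
the following sketch makes explicit by recording each set and
listing the fields that change. It is offered as an illustration
only: the class, field and function names are hypothetical, and
the SDS figure is read here as a percentage, which is an
assumption about the original text.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class HybridizationConditions:
    """Subset of the Southern blot parameters quoted in the text."""
    formamide_percent: float
    hybridization_hours: float
    wash_ssc_x: float          # SSC concentration of the wash, in "x" units
    wash_sds_percent: float    # assumption: "0.1 SDS" read as 0.1%
    wash_temp_c: float
    wash_minutes: float


LOW_STRINGENCY = HybridizationConditions(
    formamide_percent=35, hybridization_hours=48,
    wash_ssc_x=2.0, wash_sds_percent=0.1, wash_temp_c=55, wash_minutes=15,
)
HIGH_STRINGENCY = HybridizationConditions(
    formamide_percent=50, hybridization_hours=24,
    wash_ssc_x=0.1, wash_sds_percent=0.1, wash_temp_c=55, wash_minutes=30,
)


def differing_parameters(
    a: HybridizationConditions, b: HybridizationConditions
) -> Dict[str, Tuple[float, float]]:
    """Map each field whose value differs to the pair (value in a, value in b)."""
    da, db = asdict(a), asdict(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


if __name__ == "__main__":
    for name, (low, high) in differing_parameters(
        LOW_STRINGENCY, HIGH_STRINGENCY
    ).items():
        print(f"{name}: low={low}, high={high}")
```

Higher formamide and a lower salt (SSC) concentration in the wash
are the changes that raise stringency; the wash temperature is the
same in both protocols.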
A.8.5. Construction of Plasmids for In Vitro Translation

The 3 kb insert of phage lambda-13 was excised from the
phage DNA by partial EcoRI digestion, electroeluted and digested
by BamHI and HindIII. To remove most of the untranslated
sequences, the 1.8 kb cDNA fragment obtained was then partially
digested by MaeI (Boehringer). The 1.4 kb MaeI-MaeI fragment,
extending from the first to the third MaeI site in the cDNA
insert sequence (Fig. 1) and containing the complete coding
region, was mixed with SmaI-cut dephosphorylated pTZ18
(Pharmacia); the extremities were filled in using Klenow fragment
of DNA Pol I (Amersham) and ligated. Two plasmids were derived:
pCOD20 (sense) and pCOD14 (antisense).

A.8.6. Translation and Hormone Binding Assays

pCOD20 and pCOD14 were linearized with HindIII. Capped mRNA
was generated using 5 ug of DNA, 5 uM rNTP, 25 mM DTT, 100 U
RNAsin (Promega), 50 U T7 Pol (Genofit) in 40 mM Tris pH 8, 8 mM
MgCl2, 2 mM spermidine, 50 mM NaCl, in 100 ul at 37°C. Capping
was performed by omitting GTP and adding CAP (m7G(5')ppp(5')G)
(Pharmacia) for the first 15 minutes of the reaction.
Translation was performed using rabbit reticulocyte lysate
(Amersham) under the suggested conditions using 40 ul of lysate
for 2.5 ug of capped RNA.

The thyroid hormone binding assays included 5 ul of lysate
in (0.25 M sucrose, 0.25 M KCl, 20 mM Tris (pH 7.5), 1 mM MgCl2,
2 mM EDTA, 5 mM DTT) with 1 mM 125I T4, 125I T3 or 125I rT3
(specific activity: T4, rT3 1400 mCi/mg Amersham, T3 3000 nCi/mg
NEN). After at least 2 h of incubation at 0°C, free hormone was
separated from bound hormone by filtration through Millipore
HAWP 02500 filters using 10 ml of ice cold buffer. For
testosterone, retinol and retinoic acid, 10 ul of lysate were
added to 45 ul of 20 mM Tris pH 7.3, 1 mM EDTA, 50 mM NaCl, 2 mM
beta-mercaptoethanol and 5 mM testosterone, 400 mM retinol or 15
mM retinoic acid (81 Ci/mmol; 60 Ci/mmol; 46 Ci/mmol; Amersham).
After an overnight incubation at 0°C, free hormone was separated
from bound hormone by Dextran coated charcoal (0.5% Norit A -
0.05% T70) and centrifugation. All experiments were performed in
duplicate, and parallel experiments were performed with a 100
fold excess of the corresponding cold hormone.
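
The purpose of the parallel incubations with excess cold hormone
is to estimate non-specific binding, so that specific binding can
be obtained by subtraction. A minimal sketch of that arithmetic is
given below. The function name is hypothetical, the counts in the
usage lines are invented for illustration and are not data from
these experiments, and averaging the duplicates is an assumption
about how the replicates are combined.

```python
from statistics import mean
from typing import Sequence


def specific_binding(
    total_cpm: Sequence[float], nonspecific_cpm: Sequence[float]
) -> float:
    """Specific binding = mean total bound - mean bound with excess cold hormone.

    total_cpm:       bound counts from replicate tubes with labelled hormone only
    nonspecific_cpm: bound counts from replicate tubes that also contain a
                     100-fold excess of unlabelled (cold) hormone
    """
    if not total_cpm or not nonspecific_cpm:
        raise ValueError("both replicate lists must be non-empty")
    return mean(total_cpm) - mean(nonspecific_cpm)


if __name__ == "__main__":
    # Invented duplicate counts, for illustration only.
    print(specific_binding([1200, 1300], [1150, 1250]))
```

A value close to zero, as in this invented example, would
correspond to the absence of specific binding.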

* * * 

B. DIFFERENTIAL EXPRESSION AND LIGAND REGULATION OF THE
RETINOIC ACID RECEPTOR α AND β GENES

The recent cDNA cloning of several nuclear hormone
receptors, including the steroid and thyroid hormone receptors,
has revealed that their overall structures are strikingly
similar. In particular, two highly conserved regions have been
shown to correspond to the DNA- and hormone-binding domains (for
review see Evans, 1988).

Analysis of a hepatitis B virus integration site in a human
hepatocellular carcinoma led to the identification of a putative
genomic exon highly homologous to the DNA-binding domain of other
members of this nuclear receptor multigene family (Dejean et al.,
1986). Two different cDNAs homologous to this sequence have
recently been cloned (Giguere et al., 1987; Petkovich et al.,
1987; de Thé et al., 1987) and their translation products
identified as retinoic acid receptors (designated RAR α and RAR β)
(Giguere et al., 1987; Petkovich et al., 1987; Brand et al.,
1988). The two receptors have almost identical DNA- and
hormone-binding domains but differ in their N-terminal part.
Their respective genes map to different chromosomes, 17q21.1 for
RAR α (Mattei et al., 1988) and 3p24 for RAR β (Mattei et al.,
1988), and their nucleotide sequences are only distantly related.
Both genes are found in most species (Brand et al., 1988 and de
Thé, unpublished results), suggesting an early gene duplication.
Analysis of the RA-dependent gene transactivation also showed
that the ED50 values of RAR α and β were significantly different
(10^-8 and 10^-9 M, respectively), indicating that RAR β may
mediate activation of transcription at RA concentrations 10-fold
lower than those necessary for activation by RAR α (Brand et al.,
1988).

The existence of two different retinoic acid receptors
raises a number of questions as to the biological consequences of
the RAR gene duplication. In particular, differences in the
mechanisms of regulation or spatial expression patterns of the
two receptors could account for distinct physiological roles.
The tissue distribution of the transcripts for RAR α and β and
their response to RA have been studied. The results show clear
differences in the spatial patterns of expression and indicate
that the β, but not the α, RAR gene is transcriptionally
upregulated by RA in a protein synthesis-independent fashion.
The differential expression of the RAR α and β genes, coupled
with a selective regulation of RAR β gene expression by RA, may
prove to be an important component of retinoic acid physiology.
These findings strongly suggest that the two receptors are
differentially involved in the various biological effects of RA.
The results obtained in the study are summarized below.

The RAR α gene, which is transcribed as two mRNA species of
3.2 and 2.3 kb, is overexpressed in the haematopoietic cell lines
and is otherwise expressed at a low level in all the other human
tissues examined. By contrast, the RAR β gene exhibits a much
more varied expression pattern. Indeed, the two transcripts, 3
and 2.5 kb, show large variations in their levels of expression,
which range from undetectable (haematopoietic cell lines) to
relatively abundant (kidney, cerebral cortex, etc.). Run-on
studies with the hepatoma cell lines show that, at least in some
tissues, these differences may be due to an increase in the
transcription rate of the RAR β gene. These findings point to
complex regulatory mechanisms of RAR gene expression that may
confer on the cells various sensitivities to RA.

The availability of cloned RAR cDNAs prompted an investigation
of possible regulation of these receptor mRNAs by RA. Exposure
of hepatoma cells to RA led to a rapid increase in the level
of RAR β transcripts, while the abundance of RAR α transcripts
remained unaffected. The stimulation of expression of RAR β
mRNAs was induced by physiological concentrations of RA in a
dose-dependent manner. Such autoregulation is a general feature
of hormonal systems and has been shown to take place at the mRNA
and protein levels, in the case of the nuclear receptors for
glucocorticoids (down-regulation, Okrent et al., 1986) or vitamin
D3 (up-regulation, McDonnell et al., 1987). The RA-induced
upregulation of the RAR β transcripts was observed in the
presence of protein synthesis inhibitors. In vitro nuclear
transcript run-on assays show that the RA-induced increase in RAR β
mRNA levels is the consequence of enhanced transcription.
These findings demonstrate that the RAR β gene is
transcriptionally upregulated by RA and provide the first
identification of a primary target gene for RA. The cloning of the
promoter sequences of the RAR β gene should allow the
identification of the upstream genomic elements implicated in RA
responsiveness. The use of these sequences will provide a useful
tool to determine which of the α and/or β receptors is involved
in regulating RAR β gene expression.

The haematopoietic cell-line HL60 has been widely used as a 
model for RA-induced differentiation (Strickland and Mahdavi, 
1978). The data from this invention suggest that in this system 
RAR α must be responsible for the RA-induced differentiated
phenotype, since HL60 does not appear to have any RAR β mRNAs.
Note in this respect that Davies et al. (1985), studying the
RA-dependent transglutaminase expression in these cells, have
found an ED50 of 5×10⁻⁸ M, consistent with a RAR α-mediated
transactivation.

The upregulation of the β receptor gene by RA may have very
important implications in developmental biology. Morphogen
gradients are frequently implicated in cell commitment (Slack,
1987). One example of this phenomenon is the polarization of the
chick limb bud where RA, the suspected morphogen, forms a
concentration gradient across the anterior-posterior axis of the
developing bud (Thaller and Eichele, 1987). However, the small
magnitude of this gradient (2.5 fold) is puzzling and suggests
the existence of amplification mechanisms (Robertson, 1987).
Since transactivation of target genes is dependent upon both
receptor and ligand concentrations, a small increase in RA may
result in a disproportionately larger RAR β effect. The effect
of this RA gradient could be potentiated by a corresponding
gradient in RAR β receptors as a consequence of upregulation by RA
itself.
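
The amplification argument above can be sketched as a toy calculation. This is not a model taken from the specification: it assumes that the response is the product of receptor level and receptor occupancy, that occupancy is a simple saturable function of RA, and that receptor level rises linearly with occupancy up to a chosen maximum induction. The concentrations and the 10-fold maximum induction used in the example are hypothetical inputs.

```python
def occupancy(ra, ed50):
    """Assumed saturable receptor occupancy at a given RA concentration."""
    return ra / (ed50 + ra)


def response(ra, ed50, max_induction=1.0):
    """Toy response = receptor level x occupancy.

    max_induction = 1.0 means no autoregulation (constant receptor level);
    larger values let RA raise the receptor level (assumption).
    """
    occ = occupancy(ra, ed50)
    receptor_level = 1.0 + (max_induction - 1.0) * occ
    return receptor_level * occ


def gradient_ratio(ra_low, fold, ed50, max_induction=1.0):
    """Response at the high end of an RA gradient divided by the low end."""
    high = response(ra_low * fold, ed50, max_induction)
    low = response(ra_low, ed50, max_induction)
    return high / low


if __name__ == "__main__":
    # Hypothetical numbers: 2.5-fold RA gradient around an ED50 of 1e-9 M.
    plain = gradient_ratio(0.4e-9, 2.5, 1e-9)
    autoregulated = gradient_ratio(0.4e-9, 2.5, 1e-9, max_induction=10.0)
    print(f"without autoregulation: {plain:.2f}-fold")
    print(f"with autoregulation:    {autoregulated:.2f}-fold")
```

In this sketch the response difference across the gradient is larger when the receptor is itself induced by RA, which is the qualitative point made in the text.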

B.1. Tissue distribution of the α and β RAR mRNAs.

To study the differential expression of the RAR α and β
genes, Northern blot analysis was performed using 5 µg
(microgram) of poly(A)+ RNA extracted from various human tissues
and cell-lines. A RAR β clone previously identified (de The et
al., 1987) was used to isolate a partial cDNA clone for RAR α
from a hepatoma cell-line cDNA library, and the two cDNA inserts
were used as probes. More particularly, poly(A)+ mRNA (5 µg)
from different human tissues and cell-lines was denatured by
glyoxal, separated on a 1.2% agarose gel, blotted onto nylon
filters and hybridized to an α (Fig. 9, upper panel), then a β (Fig.
9, middle panel) RAR cDNA single-stranded probe (see materials
and methods, infra). Exposure time was 36 h. The filters were
subsequently hybridized to a β actin probe (Fig. 9, lower panel)
to ensure that equal amounts of RNA were present in the different
lanes. The following abbreviations are used in Fig. 9. Sp.
cord: spinal cord. C. cortex: cerebral cortex. K562 and HL60 are
two haematopoietic cell-lines. PLC/PRF/5 is a hepatoma-derived
cell-line.

Referring to Fig. 9, the spatial distribution patterns were 
clearly distinct between the two receptors. The RAR α probe
hybridized to two transcripts of 3.2 and 2.3 kilobases (kb) with an
approximately equal intensity. The two mRNAs were present at low
levels in all tissues examined but were overexpressed in the
haematopoietic cell-lines, K562 and HL60.

When the same filters were hybridized with the RAR β probe,
a much more variable transcription pattern was observed (Figure
9). Two mRNA species of 3 kb and 2.5 kb were visible in most
tissues, except in the spinal cord and the liver (adult or fetal)
where the smaller transcript was undetectable. Major
quantitative differences in the level of expression of the two
transcripts were noted. The tissues examined could be classified
into four groups with respect to expression of β receptor mRNAs:
high (kidney, prostate, spinal cord, cerebral cortex, PLC/PRF/5
cells), average (liver, spleen, uterus, ovary), low (breast,
testis) and undetectable (K562 and HL60 cells). The use of a β
probe that did not hybridize to α allowed us to correct our
previous description of β RAR transcripts in these haematopoietic
cell-lines (de The et al., 1987). The suppression of β receptor
gene expression, associated with an overexpression of RAR α mRNAs,
seems to be a general feature of haematopoietic cell-lines, since
similar results were obtained when we repeated the study using
six other cell-lines (HEL, LAMA, U937, KG1, CCRF, Burkitt) (data
not shown).

B.2. RA-induced mRNA regulation.
To investigate whether retinoic acid modulates the expression
of its own receptor, PLC/PRF/5 cells were grown in the presence
of various concentrations of RA for different times, and RAR
α and β mRNAs were analysed by Northern blot hybridization. More
particularly, semi-confluent cells were grown for 6 hr in
charcoal-stripped medium and retinoic acid was then added to the
medium at various concentrations (10⁻¹⁰ M to 10⁻⁶ M) for 4 hr.
Control cells were treated with ethanol (E). Northern-blotting
was performed as described in connection with Figure 9, except
that 30 µg of total RNA was used. Dose-response is shown in Fig.
10A.

Another analysis was performed as in Fig. 10A, except that
10⁻⁶ M RA was used for various times (0-12 h). Time-response is
shown in Fig. 10B. Exposure time was 12 hr for the β probe (Fig.
10B, lower panel) and four days for the α probe (Fig. 10B, upper
panel).

When the cells were treated with a high concentration of RA
(10⁻⁶ M), a rapid increase in β receptor mRNAs was observed, and
a dose-response analysis showed that this stimulatory effect was
already evident at a RA concentration of 10⁻⁹ M (Fig. 10A, lower
panel). From densitometry, the magnitude of the RA-induced
upregulation was 10-fold.

Since the PLC/PRF/5 cells constitutively overexpress the RAR
β mRNAs (Fig. 9), the experiment was repeated using the HEPG2
hepatoma cell-line, which has a level of RAR β expression similar
to that of normal adult liver (de The et al., 1987). In this
case, there was a greater (50-fold) RA-induced stimulation of the
levels of RAR β mRNAs (data not shown). Exposure of the
PLC/PRF/5 cells to RA (10⁻⁶ M) during various periods indicated
that the induction had a latency of one hour, was complete after
four hours, and did not decrease after an overnight treatment
(Fig. 10B, lower panel). After hybridizing the same filters with
an RAR α probe, no variation was found in the level of the α
receptor mRNAs (Fig. 10, upper panel), indicating that RA had no
effect on the expression of the RAR α gene.

B.3. Effect of inhibitors.

To investigate the mechanism of activation of the RAR β gene by
RA, experiments with PLC/PRF/5 cells were performed in the
presence or absence of various inhibitors of transcription or
translation; control cells were treated with ethanol (E).

More particularly, PLC/PRF/5 cells were exposed to
charcoal-stripped medium for 6 hr; subsequently ethanol (E), RA
(10⁻⁶ M) and/or the inhibitors cycloheximide (CH) (10 µg/ml) or
actinomycin D (AC) (5 µg/ml) were added for an additional 4 hr.
Northern-blotting was carried out using 30 µg of total RNA.
Figure 11 shows filters hybridized first to the RAR β probe (Fig.
11, right panel), then to the α probe (Fig. 11, left panel), and
finally to a β actin probe (Fig. 11, lower panel). Exposure
times were the same as for the experiments in Figure 10.

The RNA synthesis inhibitor actinomycin D (AC) abolished the
RA-induced increase in the levels of RAR β transcripts (compare
the RA+AC lane to the RA and E+AC lanes), while the protein
synthesis inhibitor cycloheximide (CH) did not (compare lanes RA+CH
to CH). Neither RA, AC, nor CH significantly affected the levels
of β actin mRNA (Fig. 11, lower panel). These findings suggest
that RA-induction of the β receptor gene results from a direct
transcriptional effect. When the same filters were rehybridized
to the RAR α probe (Fig. 11, left panel), the presence or absence
of RA had no effect on the levels of RAR α mRNAs, confirming that
the RAR α gene is not regulated by RA.

B.4. Nuclear transcript elongation analysis.

Nuclear run-on experiments were carried out to determine if
the enhanced expression of the RAR β gene was due to increased
transcription. PLC/PRF/5 cells were grown in the presence of
ethanol (E) or retinoic acid (RA), their nuclei were isolated,
and transcription was performed in the presence of (³²P)UTP. The
labelled RNAs were hybridized to filters containing
single-stranded RAR β cDNA inserts in the appropriate orientation
(S (sense) 10 µg and 1 µg), or in the reverse orientation (AS
(antisense) 20 µg). A β actin control was also included.
Exposure time was 12 hours. The results are shown in Figure 12.
The specific hybridization, which reflects the transcription
rate, is clearly induced by RA. In addition, the magnitude of
the increase in RAR β mRNAs is comparable when assessed by run-on
assays (5 to 7 fold) or Northern analysis (8 to 10 fold). These
experiments establish that the RAR β gene is transcriptionally
upregulated by RA.

Nuclear transcript elongation assays were also used to
investigate whether the higher steady-state levels of RAR β mRNAs
observed in the hepatoma cells PLC/PRF/5 compared to HEPG2 (de
The et al., 1987) were related to differences in transcription
rates. Transcript elongation assays were performed with
PLC/PRF/5 and HEPG2 cells as described below in material and
methods, in the absence of added RA. The filters contained,
respectively, 10 µg and 20 µg of sense (S) and antisense (AS) RAR β
cDNA inserts. Exposure time was 24 hours. The results are shown
in Figure 13.

A much greater specific hybridization signal, relative to
the β actin control, was observed in PLC/PRF/5 cells compared to
the HEPG2 cells (Fig. 13), indicating that their transcription
rates are different. This result suggests that at least some of
the variations in RAR β expression in the human tissues and
cell-lines (Fig. 9) might be due, in a similar manner, to
differences in the transcription rates of the RAR β gene.

B.5. Stability of RAR mRNAs.

The level of RAR β mRNAs was slightly higher after
cycloheximide treatment (compare the E lane to the CH lane in
Fig. 11, right panel). In the presence of RA, CH treatment
caused approximately a 50-fold increase in the level of RAR β
gene expression (compare lane E to RA+CH). Such superinduction
by cycloheximide has been described for several genes and
associated with either transcriptional or post-transcriptional
mechanisms (Greenberg et al., 1986).

To determine whether RNA stabilization was involved in the
induction by CH, PLC/PRF/5 cells were first stimulated for 3
hours by RA (10⁻⁶ M) in the presence of CH (10 µg/ml) and
extensively washed with culture medium. Transcription was then
blocked by addition of actinomycin D (5 µg/ml) and the level of
RAR mRNAs was monitored for the next 5 hours in the presence or
absence of CH. Northern-blotting was done using 30 µg of total
RNA. The results are shown in Figure 14. The filters were
hybridized first to the RAR β probe (Fig. 14, right panel), then to
the α probe (Fig. 14, left panel), and lastly to a β actin probe
(Fig. 14, lower panel). Exposure times were as in Figure 10.
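
A chase of this kind gives a time course of mRNA levels after transcription is blocked. As a minimal sketch, and assuming first-order (exponential) decay, a half-life can be estimated from such a time course by a least-squares fit of the logarithm of the signal against time. The function names and the synthetic data below are illustrative only; no measured values from the figures are used.

```python
import math


def half_life(times_min, levels):
    """Estimate a half-life (minutes) from a decay time course.

    Fits ln(level) = intercept + slope * t by ordinary least squares and
    returns -ln(2) / slope. Assumes first-order decay (assumption).
    """
    logs = [math.log(v) for v in levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    slope = sxy / sxx
    return -math.log(2) / slope


def fraction_remaining(t_min, half_life_min):
    """Fraction of mRNA left after t_min minutes of first-order decay."""
    return 0.5 ** (t_min / half_life_min)


if __name__ == "__main__":
    # Synthetic, noise-free time course over a 5 hour chase (hypothetical).
    times = [0, 60, 120, 180, 240, 300]
    levels = [fraction_remaining(t, 50.0) for t in times]
    print(f"estimated half-life: {half_life(times, levels):.1f} min")
```

With real densitometry values the fit would be noisy, and each signal would first need to be normalized to a loading control.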

Quantification of the RAR β mRNA levels indicated that CH
indeed stabilized the β transcripts, as CH increased their
half-life from approximately 50 to 80 min (Fig. 14, right panel).
The combined effect of increased transcription and reduced
degradation may account for the synergistic effect of RA and CH on β
mRNA levels. In the case of RAR α, cycloheximide treatment
caused only a slight increase in mRNA levels and no
superinduction by RA was observed (Fig. 11, left panel). In
addition, the α receptor mRNAs, which have a half-life of at
least 5 hours, are more stable than the RAR β transcripts (Fig.
14, left panel). A pentanucleotide, ATTTA, in A/T-rich 3'
non-coding regions seems to mediate mRNA degradation (Shaw and
Kamen, 1986). The 3.2 kb RAR α transcript has an A/T-poor 3' end
(38%) and contains two such motifs (Giguere et al., 1987;
Petkovich et al., 1987), whereas the 3 kb RAR β mRNA has an
A/T-rich 3' end (68%) and four copies of ATTTA (de The et al., 1987).
These findings are consistent with the differences in RAR α and β
mRNA stability that have been observed.
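
The two sequence features compared above, the number of ATTTA motifs and the A/T content of a 3' non-coding region, are simple to compute. The sketch below is illustrative only; the function names are hypothetical and the test sequence is made up, not taken from either receptor cDNA.

```python
import re


def count_motif_overlapping(seq, motif="ATTTA"):
    """Count occurrences of a motif, allowing overlaps (e.g. ATTTATTTA -> 2)."""
    pattern = f"(?={re.escape(motif.upper())})"
    return len(re.findall(pattern, seq.upper()))


def at_fraction(seq):
    """Fraction of A and T residues in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


if __name__ == "__main__":
    example_utr = "GGATTTACCATTTATTTA"  # made-up sequence for illustration
    print(count_motif_overlapping(example_utr))
    print(f"{at_fraction(example_utr):.0%}")
```

A lookahead pattern is used so that adjacent motifs sharing a terminal A are both counted.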

B.6. MATERIAL AND METHODS

B.6.1. Biological samples and cell-lines.
Human tissue samples were obtained from early autopsies and
kept at -80°C prior to extraction. The HEPG2 and PLC/PRF/5
hepatoma cell-lines were grown in Dulbecco's modified Eagle's
medium with 10% fetal calf serum, glutamine, and antibiotics, in 5%
CO2. Semiconfluent cells were treated with RA after a 6 h
washout in charcoal-stripped medium. All-trans-retinoic acid was
obtained from Sigma. Cycloheximide and actinomycin D (both from
Sigma) were used at concentrations of 10 and 5 µg/ml (micrograms/
milliliter), respectively.

B.6.2. RNA preparation.

The RNA was prepared by the hot phenol procedure (Maniatis
et al., 1982). Poly(A)+ mRNA was prepared by oligo(dT)-cellulose
chromatography. For Northern-blot analysis, total RNA (30 µg) or
poly(A)+ mRNA (5 µg) was denatured by glyoxal and fractionated on
a 1.2% agarose gel (Maniatis et al., 1982). The nucleic acid was
transferred to nylon membranes (Amersham) by blotting and
attached by UV exposure plus baking.

B.6.3. Recombinant clones.

The β receptor probe was a 600 bp fragment of the cDNA
previously described (de The et al., 1987) extending from the 5' end
to the Xho I site, corresponding to the 5' untranslated region and
the A/B domain. The α receptor probe was a short cDNA insert
that was isolated from a PLC/PRF/5 human hepatoma cell-line cDNA
library generated as described (Watson and Jackson, 1986). This
library was hybridized with an RAR β-derived probe (nucleotides
550 to 760) corresponding to the conserved DNA-binding domain of
RAR β. A weakly hybridizing plaque was purified, subcloned into
M13mp18, and sequenced by the dideoxy procedure. This clone was
found to be identical to RAR α and extended from nucleotides 358
to 587, corresponding to the C and D domains (Giguere et al.,
1987). Since this cDNA insert contains some regions homologous
to the RAR β cDNA, cross-hybridization has been occasionally
observed, particularly in cell-lines that overexpress RAR β mRNAs.

B.6.4. Hybridization procedure.
The two cDNA inserts were subcloned into M13 and used to
generate high specific activity (greater than 10^ c.p.m./µg)
single-stranded probes by elongation of a sequencing primer with
³²P-labelled dTTP (3000 Ci/mmol) and unlabelled nucleotides by
Klenow polymerase. The resulting double-stranded DNA was
digested using a unique site in the vector, fractionated on a
urea/acrylamide sequencing gel, and the labelled single-stranded
insert electroeluted. These probes (5×10⁶ cpm/ml) were
hybridized to the filters in 7% (v/v) sodium dodecyl sulfate (SDS),
0.5 M NaPO4 pH 6.5, 1 mM ethylenediaminetetraacetate (EDTA), and
1 mg/ml bovine serum albumin (BSA) at 68°C overnight. The
filters were washed in 1% SDS, 50 mM NaCl, 1 mM EDTA at 68°C for 10
min and autoradiographed at -70°C using Kodak XAR films and
intensifying screens. A mouse β actin probe was used to
rehybridize the filters and check that all lanes contained equal
amounts of RNA.

B.6.5. Nuclear run-on experiments.

Nuclear transcript elongation assays were performed as
described (Mezger et al., 1987). PLC/PRF/5 or HEPG2 cells (10⁸)
were challenged with ethanol or with 10⁻⁶ M RA for 6 hours in
charcoal-stripped medium. After isolation of the nuclei,
transcription was performed in a final volume of 100 µl (microliters)
with 150 µCi (microcuries) of (α-³²P)UTP (3000 Ci/mmol). Typical
incorporation ranged between 2 and 6×10⁷ cpm. The labelled RNA
was hybridized to nylon filters (Amersham) containing 10 µg and 1
µg of a 3' end RAR β cDNA insert (position 2495 to 2992, de The
et al., 1987) cloned in M13; 20 µg of the same insert in the
reverse orientation were included as a negative control. A plasmid
containing a mouse β actin insert (4 µg) provided a positive and
quantitative hybridization control. Hybridization was performed
with a probe concentration of 2-6×10⁷ cpm/ml for 48 hours.

The relative intensity of hybridization signals in
Northern-blotting and run-on experiments was estimated using a
Hoefer scanning densitometer and the appropriate computer
program.
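
As an illustration of how densitometric readings of this kind can be turned into a fold-induction figure, the sketch below divides each receptor signal by the β actin signal from the same lane before comparing treated and control lanes. The specification does not describe the densitometer program, so this normalization scheme, the function names and the numbers are all assumptions.

```python
def normalized_signal(target, actin):
    """Target band intensity relative to the actin loading control."""
    if actin <= 0:
        raise ValueError("actin signal must be positive")
    return target / actin


def fold_induction(target_treated, actin_treated, target_control, actin_control):
    """Ratio of actin-normalized signals, treated lane over control lane."""
    treated = normalized_signal(target_treated, actin_treated)
    control = normalized_signal(target_control, actin_control)
    return treated / control


if __name__ == "__main__":
    # Hypothetical densitometer readings in arbitrary units.
    print(fold_induction(1000.0, 100.0, 100.0, 100.0))
```

Normalizing to actin corrects for unequal loading between lanes, which is the role the text assigns to the actin probe.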

Our results showing a direct autoregulation of the
transcription of the RAR-β gene imply that the retinoic acid
receptor β binds to its own gene promoter sequences. To identify
those sequences, several 5' coterminal RAR-β cDNA clones were
derived from the PLC/PRF/5 library previously described.
Nucleotide sequence analysis showed that these clones extended
our previous λ13 RAR-β clone by 72 bp, which are shown in Figure
15. Thus, this invention also provides the 72 bp nucleotide
sequence shown in Fig. 15, as well as a cloned DNA sequence
encoding a polypeptide of hap gene, wherein the sequence has the
formula



CCCATGC 

GCAGCATCGGCACACTGCTCAATCAATTGAAACACAGAGCACCAGCTCTGAGGAACTCGTCCCAAG
CCCCCCATCTCCACTTCCTCCCCCTCGAGTGATCAAACCCTGCTTCGTCTGCCAGGACAAATCATC 
AGGGTACCACTATGGGGTCAGCGCCTGTGAGGGATGAAGGGCTTTTTCCGCAGAAGTATTCAGAAG
AATATGATTTACACTTGTCACCGAGATAAGAACTGTGTTATTAATAAAGTCACCAGGAATCGATGC
CAATACTGTCGACTCCAGAAGTGCTTTGAAGTGGGAATGTCCAAAGAATCTGTCAGGAATGACAGG 
AACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGCACAGAGAGCTATGAAATGACAGCTGAGTTG 
GACGATCTCACAGAGAAGATCCGAAAAGCTCACCAGGAAACTTTCCCTTCACTCTCGCAGCTGGGT 
AAATACACCACGAATTCCAGTGCTGACCATCGAGTCCGACTGGACCTGGGCCTCTGGGACAAATTC 
AGTGAACTGGCCACCAAGTGCATTATTAAGATCGTGGAGTTTGCTAAACGTCTGCCTGGTTTCACT 
GGCTTGACCATCGCAGACCAAATTACCCTGCTGAAGGCCGCCTGCCTGGACATCCTGATTCTTAGA 
ATTTGCACCAGGTATACCCCAGAACAAGACACCATGACTTTCTCAGACGGCCTTACCCTAAATCGA
ACTCAGATGCACAATGCTGGATTTGGTCCTCTGACTGACCTTGTGTTCACCTTTGCCAACCAGCTC 
CTGCCTTTGGAAATGGATGACACAGAAACAGGCCTTCTCAGTGCCATCTGCTTAATCTGTGGAGAC 
CGCCAGGACCTTGAGGAACCGACAAAAGTAGATAAGCTACAAGAACCATTGCTGGAAGCACTAAAA 
ATTTATATCAGAAAAAGACGACCCAGCAAGCCTCACATGTTTCCAAAGATCTTAATGAAAATCACA 
GATCTCCGTAGCATCAGTGCTAAAGGTGCAGAGCGTGTAATTACCTTGAAAATGGAAATTCCTGGA 
TCAATGCCACCTCTCATTCAAGAAATGATGGAGAATTCTGAAGGACATGAACCCTTGACCCCAAGT 
TCAAGTGGGAACACAGCAGAGCACAGTCCTAGCATCTCACCCAGCTCAGTGGAAAACAGTGGGGTC 
AGTCAGTCACCACTCGTGCAATAA,

and serotypic variants thereof, wherein said DNA is in a purified
form.
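
The correspondence between a DNA formula of this kind and the encoded polypeptide can be checked by conceptual translation. The sketch below uses the standard genetic code and, as an example, the final 24 nucleotides of the formula above (AGTCAGTCACCACTCGTGCAATAA). The function name is hypothetical, and reading the sequence in this frame is an assumption based on its ending in a TAA stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate DNA in frame 1 into one-letter amino acids, up to a stop."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AGTCAGTCACCACTCGTGCAATAA"))  # prints SQSPLVQ
```

The result, Ser-Gln-Ser-Pro-Leu-Val-Gln, is a carboxy-terminal heptapeptide followed by the stop codon.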

This 72 bp sequence was used as a probe to screen a human
genomic library. Six overlapping clones were derived, and a 6 kb
HindIII-BamHI insert containing the probe was subcloned into
PTZ 16 at the same sites to give rise to the plasmid pPROHAP.
Since this genomic DNA insert is limited by the BamHI site
present in the original λ13 clone and contains the additional 72 bp
of the 5' end of the mRNA, it also contains the promoter region
and all the elements necessary for the RAR-β gene expression and
regulation. Preliminary S1 analysis using the plasmid pPROHAP
end-labelled at the BamHI site suggests that the cloned RAR-β cDNAs
are full-size and that the cap site is indeed located in the 90
bp BamHI-EcoRX fragment.

A complete restriction map of the HindIII-BamHI genomic DNA
insert is shown in Figure 16.

Plasmid pPROHAP was transfected into the E. coli strain
DH5αF' (from B.R.L.). A viable culture of E. coli strain DH5αF'
transformed with plasmid pPROHAP was deposited on November 29,
1988, with the National Collection of Cultures of Microorganisms
or Collection Nationale de Cultures de Micro-organismes (C.N.C.M.)
of Institut Pasteur, Paris, France, under Culture Collection
Accession No. C.N.C.M. I-821.

This DNA insert, which is characterized by its restriction
map and partial nucleotide sequence (or some of its fragments),
provides a tool to assess RAR-β function, because it must contain
a RAR responsive enhancer. Several constructs in which this
promoter region controls the expression of indicator genes, such
as the β-galactosidase or the chloramphenicol acetyl transferase
(CAT), have been designed. Transient or stable expression, in
eucaryotic cells, of these constructs, together with an
expression vector of RAR-β, provides a useful model system to directly
assess stimulation of RAR-β by a retinoid.

Thus, this invention also provides a recombinant DNA
molecule comprising a DNA sequence coding for a retinoic acid
receptor, said DNA sequence coding on expression in a unicellular
host for a polypeptide displaying the retinoic acid and DNA
binding properties of RAR-β and being operatively linked to an
expression control sequence in said DNA molecule.

It should be apparent that the foregoing techniques as well
as other techniques known in the field of medicinal chemistry can
be employed to assay for agonists and antagonists of ligand
binding to RAR-β and binding of the RAR-β protein to DNA.
Specifically, this invention makes it possible to assay for a substance
that enhances the interaction of the ligand, the RAR-β protein,
the DNA, or combinations of these materials to elicit an
observable or measurable response. The substance can be an endogenous
physiological substance or it can be a natural or synthetic drug.

This invention also makes it possible to assay for an
antagonist that inhibits the effect of an agonist, but has no
biological activity of its own in the RAR-β effector system. Thus, for
example, the invention can be employed to assay for a natural or
synthetic substance that competes for the same receptor site on
the RAR-β protein or the DNA that the agonist occupies, or the
invention can be employed to assay for a substance that can act
on an allosteric site, which may result in allosteric inhibition.

It will be understood that this invention is not limited to
assaying for substances that interact only in a particular way,
but rather the invention is applicable to assaying for natural or
synthetic substances, which can act on one or more of the
receptor or recognition sites, including agonist binding sites,
competitive antagonist binding sites (accessory sites), and
non-competitive antagonist or regulatory binding sites
(allosteric sites).

A convenient procedure for carrying out the method of the
invention involves assaying a system for stimulation of RAR-β by
a retinoid. For example, as a retinoid binds to the receptor,
the receptor-ligand complex will bind to the responsive promoter
sequences and will activate transcription. For example,
transcription of the β-galactosidase or CAT genes can be determined.
The method of this invention makes it possible to screen
β-receptor binding retinoids. In addition, this invention makes
it possible to carry out blood tests for RAR-β activity in
patients.

* * * 

In summary, a hepatitis B virus (HBV) integration in a 147
bp cellular DNA fragment homologous to steroid receptors and
c-erbA/thyroid hormone receptor genes, previously isolated from a
human hepatocellular carcinoma (HCC), was used as a probe to clone
the corresponding complementary DNA from a human liver cDNA
library. The nucleotide sequence analysis revealed that the
overall structure of the cellular gene, named hap, is similar to that
of DNA-binding hormone receptors. That is, it displays two
highly conserved regions identified as the putative DNA-binding
and hormone-binding domains of the c-erbA/steroid receptors. Six
out of seven hepatoma and hepatoma-derived cell-lines express a
2.5 kb hap mRNA species which is undetectable in normal adult and
fetal livers but present in all non-hepatic tissues analyzed.
Low stringency hybridization experiments revealed the existence
of hap related genes in the human genome. Taken together, the
data suggest that the hap product may be a member of a new family
of ligand-responsive regulatory proteins whose inappropriate
expression in liver seems to correlate with the hepatocellular
transformed state.

Because the known receptors control the expression of target
genes that are crucial for cellular growth and differentiation,
an altered receptor could participate in the cell transformation.
In that sense, the avian v-erbA oncogene, which does not by itself
induce neoplasms in animals, potentiates the erythroblast
transforming effects of v-erbB and other oncogenes of the src family
(Kahn et al., 1986). It has been shown that the v-erbA protein
has lost its hormone-binding potential (Sap et al., 1986),
presumably as a result of one or several mutations it has
accumulated in its putative ligand-binding domain. It has also been
suggested (Edwards et al., 1979) that the growth of human breast
tumors is correlated to the presence of significant levels of
ER. This invention may provide a novel example in which a
DNA-binding protein would again relate to the oncogenic
transformation by interfering with the transcriptional regulation of
target genes. DNA-transfection assays using the native hap cDNA as
well as 'altered' hap genes derived from various HCC can provide
important information concerning any transforming capacity.

WO 89/05854 PCT/EP88/01180



* * *

WHAT IS CLAIMED IS:

1. A cloned DNA sequence encoding a polypeptide of hap
gene, wherein the sequence has part or all of the formula



ATGTTTGACTGTATGGATGTTCTGTCAGTGAGTCCTGGGCAAATCCTGATTCTACACTGCGAGTCC
GTCTTCCTGCATGCTCCAGGAGAAAGCTCTCAAAGCATGCTTCAGTGGATTGACCCAAACCGAATG
GCAGCATCGGCACACTGCTCAATCAATTGAAACACAGAGCACCAGCTCTGAGGAACTCGTCCCAAG
CCCCCCATCTCCACTTCCTCCCCCTCGAGTGATCAAACCCTGCTTCGTCTGCCAGGACAAATCATC
AGGGTACCACTATGGGGTCAGCGCCTGTGAGGGATGAAGGGCTTTTTCCGCAGAAGTATTCAGAAG
AATATGATTTACACTTGTCACCGAGATAAGAACTGTGTTATTAATAAAGTCACCAGGAATCGATGC
CAATACTGTCGACTCCAGAAGTGCTTTGAAGTGGGAATGTCCAAAGAATCTGTCAGGAATGACAGG
AACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGCACAGAGAGCTATGAAATGACAGCTGAGTTG
GACGATCTCACAGAGAAGATCCGAAAAGCTCACCAGGAAACTTTCCCTTCACTCTCGCAGCTGGGT
AAATACACCACGAATTCCAGTGCTGACCATCGAGTCCGACTGGACCTGGGCCTCTGGGACAAATTC
AGTGAACTGGCCACCAAGTGCATTATTAAGATCGTGGAGTTTGCTAAACGTCTGCCTGGTTTCACT
GGCTTGACCATCGCAGACCAAATTACCCTGCTGAAGGCCGCCTGCCTGGACATCCTGATTCTTAGA
ATTTGCACCAGGTATACCCCAGAACAAGACACCATGACTTTCTCAGACGGCCTTACCCTAAATCGA
ACTCAGATGCACAATGCTGGATTTGGTCCTCTGACTGACCTTGTGTTCACCTTTGCCAACCAGCTC
CTGCCTTTGGAAATGGATGACACAGAAACAGGCCTTCTCAGTGCCATCTGCTTAATCTGTGGAGAC
CGCCAGGACCTTGAGGAACCGACAAAAGTAGATAAGCTACAAGAACCATTGCTGGAAGCACTAAAA
ATTTATATCAGAAAAAGACGACCCAGCAAGCCTCACATGTTTCCAAAGATCTTAATGAAAATCACA
GATCTCCGTAGCATCAGTGCTAAAGGTGCAGAGCGTGTAATTACCTTGAAAATGGAAATTCCTGGA
TCAATGCCACCTCTCATTCAAGAAATGATGGAGAATTCTGAAGGACATGAACCCTTGACCCCAAGT
TCAAGTGGGAACACAGCAGAGCACAGTCCTAGCATCTCACCCAGCTCAGTGGAAAACAGTGGGGTC
AGTCAGTCACCACTCGTGCAATAA,

and serotypic variants thereof, wherein said DNA is in a purified
form.

2. DNA sequence as claimed in claim 1, which is free
of human serum proteins components, such as human tissue or
serum proteins.

3. DNA sequence as claimed in claim 1 or 2, wherein the
sequence has any of the following formulae:

(a) GTCAGGAATGACAGGAACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGC;

(b) GCTGAGTTGGACGATCTCACAGAGAAGATCCGA;

(c) GGGGTCAGTCAGTCACCACTCGTGCAA;

(d) AATGACAGGAACAAGAAAAAGAAGGAGACT;

(e) ATGTTTGACTGTATGGATGTTCTGTCAGTGAGTCCTGGGCAAATCCTCGATTTC
TACACTGCGAGTCCGTCTTCCTGCATGCTCCAGGAGAAAGCTCTCAAAGCATGC
TTCAGTGGATTGACCCAAACCGAATGGCAGCATCGGCACACTGCTCAATCA;

(f) CATGAACCCTTGACCCCAAGTTCAAGTGGGAACACAGCAGAGCACAGTCCTAGC
ATCTCACCCAGCTCAGTGGAAAACAGTGGGGTCAGTCAGTCACCACTCGTGCAA;

(g) CCCATGC
GAGCTGTTTGAAGGACTGGGATGCCG AGAACGCGAGCGATCCGAGCAGGGTTC
ATGTTTGACTGTATGGATGTTCTGTCAGTGAGTCCTGGGCAAATCCTGATTCTACACTGCGAGTCC
GTCTTCCTGCATGCTCCAGGAGAAAGCTCTCAAAGCATGCTTCAGTGGATTGACCCAAACCGAATG
GCAGCATCGGCACACTGCTCAATCAATTGAAACACAGAGCACCAGCTCTGAGGAACTCGTCCCAAG
CCCCCCATCTCCACTTCCTCCCCCTCGAGTGATCAAACCCTGCTTCGTCTGCCAGGACAAATCATC
AGGGTACCACTATGGGGTCAGCGCCTGTGAGGGATGAAGGGCTTTTTCCGCAGAAGTATTCAGAAG
AATATGATTTACACTTGTCACCGAGATAAGAACTGTGTTATTAATAAAGTCACCAGGAATCGATGC
CAATACTGTCGACTCCAGAAGTGCTTTGAAGTGGGAATGTCCAAAGAATCTGTCAGGAATGACAGG
AACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGCACAGAGAGCTATGAAATGACAGCTGAGTTG
GACGATCTCACAGAGAAGATCCGAAAAGCTCACCAGGAAACTTTCCCTTCACTCTCGCAGCTGGGT
AAATACACCACGAATTCCAGTGCTGACCATCGAGTCCGACTGGACCTGGGCCTCTGGGACAAATTC
AGTGAACTGGCCACCAAGTGCATTATTAAGATCGTGGAGTTTGCTAAACGTCTGCCTGGTTTCACT
GGCTTGACCATCGCAGACCAAATTACCCTGCTGAAGGCCGCCTGCCTGGACATCCTGATTCTTAGA
ATTTGCACCAGGTATACCCCAGAACAAGACACCATGACTTTCTCAGACGGCCTTACCCTAAATCGA
ACTCAGATGCACAATGCTGGATTTGGTCCTCTGACTGACCTTGTGTTCACCTTTGCCAACCAGCTC
CTGCCTTTGGAAATGGATGACACAGAAACAGGCCTTCTCAGTGCCATCTGCTTAATCTGTGGAGAC
CGCCAGGACCTTGAGGAACCGACAAAAGTAGATAAGCTACAAGAACCATTGCTGGAAGCACTAAAA
ATTTATATCAGAAAAAGACGACCCAGCAAGCCTCACATGTTTCCAAAGATCTTAATGAAAATCACA
GATCTCCGTAGCATCAGTGCTAAAGGTGCAGAGCGTGTAATTACCTTGAAAATGGAAATTCCTGGA
TCAATGCCACCTCTCATTCAAGAAATGATGGAGAATTCTGAAGGACATGAACCCTTGACCCCAAGT
TCAAGTGGGAACACAGCAGAGCACAGTCCTAGCATCTCACCCAGCTCAGTGGAAAACAGTGGGGTC
AGTCAGTCACCACTCGTGCAATAA



v. 
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4. A DHA probe consisting essentially of a radio- 
nuclide bonded to the DBA sequence of any of claims 1 to 3. 

5. A hybrid duplex molecule consisting essentially
of the DNA sequence of claim 1 hydrogen bonded to a
nucleotide sequence of complementary base sequence.

6. Hybrid duplex molecule as claimed in claim 9,
wherein said nucleotide sequence is either a DNA sequence
or an RNA sequence.

7. A polypeptide comprising an amino acid
sequence of hap protein, wherein the polypeptide contains
part or all of the amino acid sequence:

MetPheAspCysMetAspValLeuSerValSerProGlyGlnIleLeuAspPheTyrThrAla
SerProSerSerCysMetLeuGlnGluLysAlaLeuLysAlaCysPheSerGlyLeuThrGln
ThrGluTrpGlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu
LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys
GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe
ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn
LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer
LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr
GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln
GluThrPheProSerLeuCysGlnLeuGlyLysTyrThrThrAsnSerSerAlaAspHisArg
ValArgLeuAspLeuGlyLeuTrpAspLysPheSerGluLeuAlaThrLysCysIleIleLys
IleValGluPheAlaLysArgLeuProGlyPheThrGlyLeuThrIleAlaAspGlnIleThr
LeuLeuLysAlaAlaCysLeuAspIleLeuIleLeuArgIleCysThrArgTyrThrProGlu
GlnAspThrMetThrPheSerAspGlyLeuThrLeuAsnArgThrGlnMetHisAsnAlaGly
PheGlyProLeuThrAspLeuValPheThrPheAlaAsnGlnLeuLeuProLeuGluMetAsp
AspThrGluThrGlyLeuLeuSerAlaIleCysLeuIleCysGlyAspArgGlnAspLeuGlu
GluProThrLysValAspLysLeuGlnGluProLeuLeuGluAlaLeuLysIleTyrIleArg
LysArgArgProSerLysProHisMetPheProLysIleLeuMetLysIleThrAspLeuArg
SerIleSerAlaLysGlyAlaGluArgValIleThrLeuLysMetGluIleProGlySerMet
ProProLeuIleGlnGluMetMetGluAsnSerGluGlyHisGluProLeuThrProSerSer
SerGlyAsnThrAlaGluHisSerProSerIleSerProSerSerValGluAsnSerGlyVal
SerGlnSerProLeuValGln,
and serotypic variants and fragments thereof, wherein said
polypeptide is free from human serum proteins, virus, viral
protein, human tissue, and human tissue components.

8. The polypeptide of claim 1 which displays 
the retinoic acid and DNA binding properties of RAR-beta. 

9. Polypeptide as claimed in claim 7 or 8, 
which is free from human, blood-derived protein. 

10. A polypeptide as claimed in claim 7, wherein
the polypeptide comprises a peptide fragment having any
of the following amino acid sequences:

(a) GlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu
LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys
GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe
ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn
LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer
LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr
GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln
GluThrPheProSerLeuCys;

(b) ValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCys (peptide 1)
and serotypic variants thereof;

(c) AsnAspArgAsnLysLysLysLysGluThrCys (peptide 2)
and serotypic variants thereof;

(d) CysGlyValSerGlnSerProLeuValGln (peptide 3)
and serotypic variants thereof;

(e) AlaGluLeuAspAspLeuThrGluLysIleArg
and serotypic variants thereof;
(f) MetPheAspCysMetAspValLeuSerValSerProGlyGlnIleLeuAspPheTyrThr
AlaSerProSerSerCysMetLeuGlnGluLysAlaLeuLysAlaCysPheSerGlyLeu
ThrGlnThrGluTrpGlnHisArgHisThrAlaGlnSer;

(g) HisGluProLeuThrProSerSerSerGlyAsnThrAlaGluHisSerProSer
IleSerProSerSerValGluAsnSerGlyValSerGlnSerProLeuValGln.



11. A process for selecting, from a group of nucleotide
sequences, a nucleotide sequence, e.g. a DNA sequence
or an RNA sequence, said nucleotide sequence being preferably
labelled, e.g. by a radionuclide, said nucleotide sequence
coding for hap protein or a portion thereof, said process
comprising the step of determining which of said nucleotide
sequences hybridises to a DNA sequence as claimed in claim 1.

12. Process as claimed in claim 11, wherein said
nucleotide sequence is selected by Southern blot technique,
under high stringency conditions performed as follows:

24 h prehybridization in 50 % formamide, 5x Denhardt,
5x SSC, 300 µg/ml denatured salmon sperm DNA, at 40°C;

48 h hybridization with 35 % formamide, 5x Denhardt, 5x SSC,
10 % dextran sulfate, 2.10^6 cpm/ml denatured 32P-labelled
DNA probe (specific activity 5.10 cpm/µg); washes in
0.1x SSC, 0.1 % SDS, 55°C for 30 minutes.

13. Process as claimed in claim 11, wherein said
nucleotide sequence is selected by Northern blot technique
according to Maniatis et al. (1982).

14. A recombinant vector comprising lambda-NM1149
into which has been inserted the DNA sequence
as claimed in any of claims 1 to 3.

15. Plasmid pCOD20. 

16. An E. coli bacterial culture in a purified
form, wherein the culture comprises E. coli cells containing
DNA, wherein a portion of said DNA comprises the DNA
sequence as claimed in claim 1.
17. A method for assaying a fluid for the presence of an
agonist or antagonist to retinoic acid receptor RAR-β, wherein
the method comprises

(A) providing an aqueous solution containing a known
concentration of the proteinaceous receptor as claimed in claim 13,

(B) incubating the receptor with the fluid suspected of
containing the agonist or antagonist under conditions sufficient
to bind the receptor to the agonist or antagonist, and

(C) determining whether there is a change in concentration
of the proteinaceous receptor in the aqueous solution.

18. Method as claimed in claim 17, wherein the 
receptor and the agonist or antagonist form a complex. 

19. Method as claimed in claim 18, wherein a cross-linking
agent is present in an amount sufficient to inhibit
dissociation of the receptor and the agonist or antagonist.

20. A recombinant DNA molecule comprising a DNA sequence
coding for a retinoic acid receptor, said DNA sequence coding on
expression in a unicellular host for a polypeptide displaying the
retinoic acid and DNA binding properties of RAR-β and being
operatively linked to an expression control sequence in said DNA
molecule.

21. The recombinant DNA molecule of claim 20 wherein
the DNA sequence is any one of claims 1 to 3.

22. Plasmid pROHAP. 

23. Bacterial culture as claimed in claim 16, wherein
said cells are comprised of E. coli strain DH5αF'.
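
The correspondence between the claimed DNA fragments and the claimed peptides can be checked by translation. The following is a minimal Python sketch, offered only as an illustration and not as part of the claimed subject matter. It assumes the standard genetic code and translates DNA fragment (d) into three-letter residue names; it also forms the complementary strand of the kind referred to in claim 5. The function names `translate` and `reverse_complement` and the constant `FRAGMENT_D` are hypothetical names chosen for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter residue names, matching the notation of the claims.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon
}


def translate(dna: str) -> str:
    """Translate a DNA string, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# DNA fragment (d) as listed in the claims (illustrative constant name).
FRAGMENT_D = "AATGACAGGAACAAGAAAAAGAAGGAGACT"

if __name__ == "__main__":
    print(translate(FRAGMENT_D))
    print(reverse_complement(FRAGMENT_D))
```

Under these assumptions, translating fragment (d) gives AsnAspArgAsnLysLysLysLysGluThr, which matches the first ten residues of peptide 2; the C-terminal Cys of peptide 2 is not encoded by fragment (d).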